ABSTRACT


Dynamic programming is employed to obtain a solution to the problem
of controlling a nonlinear system in an optimal fashion, subject to a
quadratic performance index. The technique used is similar to that given
by Merriam and Kalman for linear systems.

For some special nonlinear systems, the solution can be computed
by direct application of this technique. As an example, the optimal
control system for a freely spinning body is determined.

For more general nonlinear systems, the solution cannot be obtained
directly. However, it is possible to obtain a solution indirectly. This is
done by first linearizing the vector state equations representing the
nonlinear system. Next, dynamic programming is used to obtain an
approximate solution based on the linearized state equations. Then an
iterative procedure for improving the solution is presented. It can be
shown that if the iterative procedure converges, it converges to the
solution of the optimal nonlinear control problem.

Computer example problems are given to illustrate the method, and
to indicate the convergence that is usually achieved. In addition, the
performance of the optimal control system is compared with the
performance of a simple sub-optimal control system for some of the
example problems given.
CHAPTER I


SUMMARY 


1.1  Introduction


During the past decade, a new approach to automatic control has been
developed, principally as the result of work by Bellman and Kalman
in this country, and Pontryagin [8] in the U.S.S.R. This approach, which
is now commonly called the "theory of optimal control systems," differs
from the now classical approach to automatic control of Newton, Gould,
and Kaiser [9], for instance, in that it uses a vector differential equation
description of the system instead of a transfer function description, and
it concentrates on time domain methods of analysis and synthesis
instead of frequency domain methods. The theory of optimal control has
made use of the calculus of variations [7, 10] and the new but related
"dynamic programming" of Bellman, as well as the "maximum
principle" of Pontryagin.

Useful results of the application of these methods to optimal control
problems have been obtained primarily for linear systems. Useful results
have been obtained for nonlinear systems in only a few very special
cases [13, 14]. It is the objective of this work to extend to nonlinear
systems some techniques that have been successful in the design of
controls for linear systems.

1.2  Notation and Terminology

An attempt has been made to keep the notation and terminology
consistent with current literature. In particular, the notation used by Kalman
has been used whenever practicable.

Vectors are designated by underlined lower case letters. All vectors
are understood to be column vectors. For example, the vector x denotes

    \underline{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}        (1.1)


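Similarly, matrices are designated by underlined upper case letters.
For example, the matrix A denotes

    \underline{A} = \begin{bmatrix}
        a_{11} & a_{12} & \cdots & a_{1n} \\
        a_{21} & a_{22} & \cdots & a_{2n} \\
        \vdots & \vdots &        & \vdots \\
        a_{n1} & a_{n2} & \cdots & a_{nn}
    \end{bmatrix}        (1.2)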


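The transpose of a vector or a matrix is designated by a prime. Thus

    \underline{x}' = [\, x_1 \;\; x_2 \;\; \cdots \;\; x_n \,]        (1.3)

and

    \underline{A}' = \begin{bmatrix}
        a_{11} & a_{21} & \cdots & a_{n1} \\
        a_{12} & a_{22} & \cdots & a_{n2} \\
        \vdots & \vdots &        & \vdots \\
        a_{1n} & a_{2n} & \cdots & a_{nn}
    \end{bmatrix}        (1.4)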


The inner product of two vectors is denoted by x'y, and is given by

    \underline{x}'\underline{y} = \sum_{i=1}^{n} x_i y_i        (1.5)

Consistent with this, the square of the Euclidean norm, denoted by
||x||^2, is given by

    \|\underline{x}\|^2 = \underline{x}'\underline{x}        (1.6)


The quadratic form of a vector with respect to a symmetric matrix
A is given by x'Ax. For convenience, it is often indicated by

    \|\underline{x}\|^2_{\underline{A}} = \underline{x}'\underline{A}\,\underline{x}        (1.7)


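The derivative of a vector or a matrix with respect to the scalar
variable, time, is indicated by the notation,

    \dot{\underline{x}} = d\underline{x}/dt = \begin{bmatrix} dx_1/dt \\ dx_2/dt \\ \vdots \\ dx_n/dt \end{bmatrix}        (1.8)

and

    \dot{\underline{A}} = d\underline{A}/dt = \begin{bmatrix}
        da_{11}/dt & da_{12}/dt & \cdots & da_{1n}/dt \\
        da_{21}/dt & da_{22}/dt & \cdots & da_{2n}/dt \\
        \vdots     & \vdots     &        & \vdots     \\
        da_{n1}/dt & da_{n2}/dt & \cdots & da_{nn}/dt
    \end{bmatrix}        (1.9)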


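The gradient of a scalar function of x is denoted by

    V_{\underline{x}}(\underline{x}) = \operatorname{grad} V(\underline{x}) = \begin{bmatrix}
        \partial V(\underline{x})/\partial x_1 \\
        \partial V(\underline{x})/\partial x_2 \\
        \vdots \\
        \partial V(\underline{x})/\partial x_n
    \end{bmatrix}        (1.10)

Similarly, the Jacobian matrix of a vector function of x is denoted by

    \underline{f}_{\underline{x}}(\underline{x}) = \begin{bmatrix}
        \partial f_1(\underline{x})/\partial x_1 & \partial f_1(\underline{x})/\partial x_2 & \cdots & \partial f_1(\underline{x})/\partial x_n \\
        \partial f_2(\underline{x})/\partial x_1 & \partial f_2(\underline{x})/\partial x_2 & \cdots & \partial f_2(\underline{x})/\partial x_n \\
        \vdots & \vdots & & \vdots \\
        \partial f_n(\underline{x})/\partial x_1 & \partial f_n(\underline{x})/\partial x_2 & \cdots & \partial f_n(\underline{x})/\partial x_n
    \end{bmatrix}        (1.10a)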
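The notation of this section can be illustrated numerically. The following is a minimal sketch in plain Python, not part of the original report; the helper names `inner`, `quad_form`, and `jacobian` are hypothetical, and the Jacobian is only a central-difference estimate of (1.10a).

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def inner(x: Vector, y: Vector) -> float:
    """Inner product x'y, as in equation (1.5)."""
    return sum(xi * yi for xi, yi in zip(x, y))


def quad_form(x: Vector, a: Matrix) -> float:
    """Quadratic form ||x||^2_A = x'Ax, as in equation (1.7)."""
    ax = [inner(row, x) for row in a]
    return inner(x, ax)


def jacobian(f: Callable[[Vector], Vector], x: Vector, eps: float = 1e-6) -> Matrix:
    """Central-difference estimate of the Jacobian matrix f_x(x) of (1.10a)."""
    n = len(x)
    cols = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = f(xp), f(xm)
        # Column j holds the partial derivatives of every f_i with respect to x_j.
        cols.append([(p - m) / (2.0 * eps) for p, m in zip(fp, fm)])
    # Transpose so that row i corresponds to f_i and column j to x_j.
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]
```

For instance, `quad_form([1.0, 2.0], [[2.0, 0.0], [0.0, 3.0]])` evaluates x'Ax for a diagonal weighting matrix.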


1.3  Problem Statement

Consider the system described by the vector differential equations

    \dot{x}(t) = f(x(t), u(t), t); \qquad x(0) = c        (1.11)

    y(t) = h(x(t), t)        (1.12)

where x(t) is the system state vector and y(t) is the system output
vector. For this system, it is desired to find the control vector, u(t),
such that a performance index, J(t), is a minimum. In particular, we
will assume that J(t) has the quadratic form,

    J(t) = \int_t^T \left[ \tfrac{1}{2}\|z(\tau) - y(\tau)\|^2_{Q(\tau)} + \tfrac{1}{2}\|u(\tau)\|^2_{R(\tau)} \right] d\tau        (1.13)

where z(t) is the system desired output, and Q(t) and R(t) are
positive definite matrices weighting the system error and control effort,
respectively. We will require that the control, u(t), be expressed as
u(x(t), z(t)) so that it can be realized in a feedback configuration.

It  is  mathematically  convenient  to  consider  first  the  discrete  time 
version  of  the  same  problem  for  the  theoretical  development.    Actually, 
the  discrete  time  version  is  a  meaningful  problem  in  its  own  right.    It 
is  this  version  that  applies  when  a  digital  computer  is  used  to  synthesize 
the  controller. 

For the discrete time problem, the equations

    x(k+1) = f(x(k), u(k), k); \qquad x(0) = c        (1.14)

    y(k) = h(x(k), k)        (1.15)

replace equations (1.11) and (1.12), and

    J(k) = \sum_{j=k}^{N} \tfrac{1}{2}\|z(j) - y(j)\|^2_{Q(j)} + \sum_{j=k}^{N-1} \tfrac{1}{2}\|u(j)\|^2_{R(j)}        (1.16)

replaces equation (1.13).
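As an illustration of the discrete time formulation, the sketch below simulates a scalar system of the form (1.14) and evaluates the performance index (1.16) from k = 0. It is an assumed, simplified reading (scalar state, constant weights q and r); the function names `simulate` and `performance_index` are hypothetical.

```python
from typing import Callable, List


def simulate(f: Callable[[float, float, int], float], x0: float, u: List[float]) -> List[float]:
    """Run x(k+1) = f(x(k), u(k), k) forward from x(0) = x0, as in (1.14)."""
    x = [x0]
    for k, uk in enumerate(u):
        x.append(f(x[k], uk, k))
    return x


def performance_index(x: List[float], u: List[float], z: List[float],
                      h: Callable[[float, int], float], q: float, r: float) -> float:
    """Scalar form of (1.16) with k = 0: error terms over j = 0..N, control terms over j = 0..N-1."""
    error = sum(0.5 * q * (z[j] - h(x[j], j)) ** 2 for j in range(len(x)))
    effort = sum(0.5 * r * uj ** 2 for uj in u)
    return error + effort
```

Here `x` has one more entry than `u`, which mirrors the different upper limits of the two sums in (1.16).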


1.4     Solution  of  the  Discrete  Time  Problem 

The  solution  of  the  discrete  time  nonlinear  optimal  control  problem 
is  sketched  here.    For  a  detailed  solution,    see  Chapter  II. 

In order to proceed by dynamic programming, we define the value
function

    V_{N-k}(x(k)) = \min_{u(k), \ldots, u(N-1)} [\, J(k) \,]        (1.17)

Then by the "principle of optimality," it follows that

    V_{N-k}(x(k)) = \min_{u(k)} \left\{ \tfrac{1}{2}\|z(k) - y(k)\|^2_{Q(k)} + \tfrac{1}{2}\|u(k)\|^2_{R(k)} + V_{N-k-1}(x(k+1)) \right\}        (1.18)

An approximate solution to this equation can be obtained by assuming

    x(k+1) = f(x^*(k), u^*(k), k) + f_x(x^*(k), u^*(k), k)[x(k) - x^*(k)] + f_u(x^*(k), u^*(k), k)[u(k) - u^*(k)]        (1.19)

    y(k) = h(x^*(k), k) + h_x(x^*(k), k)[x(k) - x^*(k)]        (1.20)

and

    V_{N-k}(x(k)) = \tfrac{1}{2}\|x(k)\|^2_{P(k)} + \chi'(k)\,x(k) + a(k)        (1.21)

where P(k), χ(k), and a(k) are a parametric matrix, vector, and
scalar to be determined, and where x*(k) and u*(k) are as yet
unspecified points about which we linearize.

The approximate solution obtained by combining equations (1.18),
(1.19), (1.20), and (1.21) is given by the equations

    u(k) = -[R(k) + f_u' P(k+1) f_u]^{-1} f_u' [P(k+1) f_x x(k) + P(k+1) b(k) + \chi(k+1)]        (1.22)

    P(k) = h_x' Q(k) h_x + f_x' M(k) P(k+1) f_x        (1.23)

    \chi(k) = f_x' M(k) [P(k+1) b(k) + \chi(k+1)] - h_x' Q(k) [z(k) - c(k)]        (1.24)

    a(k) = a(k+1) + \tfrac{1}{2}\|z(k) - c(k)\|^2_{Q(k)} + \tfrac{1}{2}\|b(k)\|^2_{P(k+1)} + b'(k)\chi(k+1)
           - \tfrac{1}{2}\|P(k+1) b(k) + \chi(k+1)\|^2_{f_u [R(k) + f_u' P(k+1) f_u]^{-1} f_u'}        (1.25)

where

    M(k) = I - P(k+1) f_u [R(k) + f_u' P(k+1) f_u]^{-1} f_u'        (1.26)

and

    b(k) = f - f_x x^*(k) - f_u u^*(k)        (1.27)

    c(k) = h - h_x x^*(k)        (1.28)

In the above equations the arguments for f, f_x, f_u, h, and h_x have
been omitted for simplicity. They are understood to be evaluated at
x*(k), u*(k), and k, as appropriate.

The boundary conditions for equations (1.23), (1.24), and (1.25) can
be obtained from equations (1.16) and (1.21). They are

    P(N+1) = 0        (1.29)

    \chi(N+1) = 0        (1.30)

    a(N+1) = 0        (1.31)

Notice that equations (1.23), (1.24), and (1.25) must be solved
backward in time, starting at time N+1, where the boundary conditions
are known, and working backward to the present time k. This implies
that the desired output, z(k), must be known in advance so that the
parameters P and χ can be pre-computed. Once these parameters are
known, the control system can be synthesized. Figure 1.1 shows a
block diagram of the control system for the discrete time nonlinear
optimal control problem.

[Figure 1.1: block diagram of the control system]

From figure 1.1 it can be seen that the controller for the system
consists of a time varying linear feedback portion, and a director
portion. The feedback portion of the controller will insure that the
system will be relatively insensitive to state or parameter perturbations
occurring in the system being controlled.

The question of stability, which is of paramount importance in any
control system, can be answered by the use of the second method of
Lyapunov. By using this method, it can be shown that the control systems
designed from the theory presented here are always stable. A more
detailed discussion of stability is contained in Appendix B.

The  theory  outlined  above  provides  an  approximately  optimal 
solution,    only.    How  near  optimal  the  solution  is  depends  on  how  near 
the  vectors    x*(k)     and    u*  (k),      which  must  be  given  beforehand,    are  to 
the  actual  state  and  control  vectors,      x  (k)     and    u  (k).      An  exact  solution 
to  the  nonlinear  control  problem  can  be  obtained  by  solving  the  equations 
for  the  approximately  optimal  solution  in  an  iterative  fashion. 

At each iteration, the x(k) and u(k) determined on the previous
iteration are used for the x*(k) and u*(k). If convergence is achieved
by this procedure, the solution obtained is the exact solution to the
nonlinear control problem. The question of under what conditions the
iterative procedure converges is still unanswered, but experience
using this algorithm on a digital computer indicates that convergence
occurs for a broad range of problems, and that convergence is usually
achieved in three or four iterations.

The theory presented in this section can be extended to systems with
stochastic disturbances by minor modifications. However, the iterative
algorithm does not produce an exact solution in this case. Details for
the problem when stochastic disturbances are present are given in
section 2.7.

1.5  Solution of the Continuous Time Problem

The equations specifying the solution to the continuous time problem
may be obtained by dynamic programming in a manner analogous to that
used for the discrete time problem. These equations are

    u(t) = -R^{-1}(t) f_u' [P(t) x(t) + \chi(t)]        (1.32)

    \dot{P}(t) = P(t) f_u R^{-1}(t) f_u' P(t) - h_x' Q(t) h_x - P(t) f_x - f_x' P(t)        (1.33)

    \dot{\chi}(t) = h_x' Q(t) [z(t) - c(t)] + P(t) f_u R^{-1}(t) f_u' \chi(t) - P(t) b(t) - f_x' \chi(t)        (1.34)

    \dot{a}(t) = -\tfrac{1}{2}\|z(t) - c(t)\|^2_{Q(t)} + \tfrac{1}{2}\|\chi(t)\|^2_{f_u R^{-1}(t) f_u'} - \chi'(t) b(t)        (1.35)

with the boundary conditions

    P(T) = 0        (1.36)

    \chi(T) = 0        (1.37)

    a(T) = 0        (1.38)

Chapter III contains a full development of the theory for the continuous
time problem. The question of system stability is discussed with reference
to the continuous time problem in Appendix B.

1.6  An Analytic Example

Consider the equations of motion of a freely spinning body about
three mutually perpendicular axes,

    \dot{x}_1 = a_1 x_2 x_3 + u_1; \qquad x_1(0) = c_1        (1.39)

    \dot{x}_2 = a_2 x_1 x_3 + u_2; \qquad x_2(0) = c_2        (1.40)

    \dot{x}_3 = a_3 x_1 x_2 + u_3; \qquad x_3(0) = c_3        (1.41)

where x_1, x_2, and x_3 are the angular velocities, where u_1, u_2, and
u_3 are controls proportional to torques, and where

    a_1 + a_2 + a_3 = 0        (1.42)

These equations are nonlinear and coupled.

We wish to determine u_1, u_2, and u_3 such that the performance
index

    J = \int_0^T \left\{ \tfrac{1}{2} q(t) [x_1^2 + x_2^2 + x_3^2] + \tfrac{1}{2} r(t) [u_1^2 + u_2^2 + u_3^2] \right\} dt        (1.43)

is a minimum.

The solution to this problem can be obtained exactly and analytically,
it turns out, if we proceed in the same manner as that indicated in the
previous section. The solution is

    u_1(t) = -k(t) x_1(t)        (1.44)

    u_2(t) = -k(t) x_2(t)        (1.45)

    u_3(t) = -k(t) x_3(t)        (1.46)

where

    k(t) = p(t)/r(t)        (1.47)

and where

    \dot{p}(t) = p^2(t)/r(t) - q(t); \qquad p(T) = 0        (1.48)

For r(t) and q(t) constant, that is

    r(t) = r        (1.49)

    q(t) = q        (1.50)

the solution of equation (1.48) is

    p(\tau) = r\,k(\tau) = r\alpha \, \frac{1 - e^{-2\alpha\tau}}{1 + e^{-2\alpha\tau}}        (1.51)

where

    \alpha = \sqrt{q/r}        (1.52)

and

    \tau = T - t        (1.53)
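One way to see why a single scalar gain suffices here: because of (1.42), the product x'f for the uncontrolled dynamics is (a_1 + a_2 + a_3) x_1 x_2 x_3 = 0, so a value function of the form (1/2) p(t) ||x||^2 is unaffected by the nonlinear coupling and leads directly to (1.48). The sketch below, which is an illustration and not the author's program, evaluates the gain from (1.47) and (1.51) and checks (1.48) by finite differences. The names `riccati_gain`, `riccati_residual`, and `coupling_power` are hypothetical.

```python
import math
from typing import Sequence


def riccati_gain(t: float, T: float, q: float, r: float) -> float:
    """Feedback gain k(t) = p(t)/r from (1.47) and (1.51), for constant q and r."""
    alpha = math.sqrt(q / r)          # equation (1.52)
    tau = T - t                       # equation (1.53)
    e = math.exp(-2.0 * alpha * tau)
    p = r * alpha * (1.0 - e) / (1.0 + e)
    return p / r


def riccati_residual(t: float, T: float, q: float, r: float, dt: float = 1e-5) -> float:
    """Finite-difference residual of p_dot = p^2/r - q, equation (1.48)."""
    def p(s: float) -> float:
        return r * riccati_gain(s, T, q, r)
    p_dot = (p(t + dt) - p(t - dt)) / (2.0 * dt)
    return p_dot - (p(t) ** 2 / r - q)


def coupling_power(x: Sequence[float], a: Sequence[float]) -> float:
    """x'f for the uncontrolled dynamics of (1.39) to (1.41)."""
    return (x[0] * a[0] * x[1] * x[2]
            + x[1] * a[1] * x[0] * x[2]
            + x[2] * a[2] * x[0] * x[1])
```

The gain vanishes at t = T, consistent with p(T) = 0, and approaches α when the remaining time τ is long.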


A block diagram of this control system is shown in figure 1.2.


Figure 1.2


A  detailed  derivation  of  the  control  equations  for  this  system,    as  well  as 
a  comparison  of  this  control  system  with  a  sub-optimal  one  that  uses 
constant  gain  linear  feedback,    is  contained  in  Chapter  IV. 

1.7  Computer Examples

Consider the system described by the nonlinear equations

    x(k+1) = x(k) - 0.05\,x^3(k) + 0.05\,u(k); \qquad x(1) = 1.0        (1.54)

    y(k) = x(k)        (1.55)

We wish to determine u(1), ..., u(99) such that the performance index

    J = \sum_{k=1}^{100} \tfrac{1}{2} Q [z(k) - x(k)]^2 + \sum_{k=1}^{99} \tfrac{1}{2} R\,u^2(k)        (1.56)

is a minimum.

The equations that form the basis for the iterative solution to this
problem are given by equations (1.19), (1.22), (1.23), and (1.24). In
this problem, all the variables appearing in these equations should be
interpreted as scalars. Figure 1.3 shows the results of the computer
solution of this problem for the case when R = 0.01, Q = 10.0, and
z(k) = 0 for k < 50, but z(k) = 1.0 for k > 50. The iteration
procedure converged (based on a convergence criterion of a 1 percent
change in the performance index) in three iterations. The performance
index on the third iteration was 12.272.
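The iterative procedure for this scalar example can be sketched as follows. This is an illustrative reading of equations (1.22) to (1.24) and (1.26), not the author's program, and it rests on several assumptions that the text does not state: the first nominal trajectory is generated with zero control, z(k) is taken as 1.0 at k = 50, and list index i stands for time k = i + 1. Since h(x) = x, h_x = 1 and c(k) = 0. The names `sweep` and `solve` are hypothetical, and no claim is made that the sketch reproduces the quoted performance index exactly.

```python
from typing import List, Tuple

F_U = 0.05  # df/du for system (1.54); constant


def f(x: float, u: float) -> float:
    """Nonlinear state equation (1.54)."""
    return x - 0.05 * x ** 3 + 0.05 * u


def f_x(x: float) -> float:
    """df/dx for system (1.54)."""
    return 1.0 - 0.15 * x ** 2


def cost(x: List[float], u: List[float], z: List[float], q: float, r: float) -> float:
    """Performance index (1.56)."""
    return (sum(0.5 * q * (zj - xj) ** 2 for zj, xj in zip(z, x))
            + sum(0.5 * r * uj ** 2 for uj in u))


def sweep(x_nom: List[float], u_nom: List[float], z: List[float],
          q: float, r: float, x0: float) -> Tuple[List[float], List[float]]:
    """One iteration: linearize about the nominal, recurse backward, run forward."""
    n = len(u_nom)
    fx = [f_x(x_nom[k]) for k in range(n)]
    b = [f(x_nom[k], u_nom[k]) - fx[k] * x_nom[k] - F_U * u_nom[k] for k in range(n)]  # (1.27)
    p = [0.0] * (n + 1)
    chi = [0.0] * (n + 1)
    p[n] = q                # (1.23) at the final time, using P(N+1) = 0
    chi[n] = -q * z[n]      # (1.24) at the final time, using chi(N+1) = 0
    for k in range(n - 1, -1, -1):
        d = r + F_U * p[k + 1] * F_U
        m = 1.0 - p[k + 1] * F_U * F_U / d                                   # (1.26)
        p[k] = q + fx[k] * m * p[k + 1] * fx[k]                              # (1.23)
        chi[k] = fx[k] * m * (p[k + 1] * b[k] + chi[k + 1]) - q * z[k]       # (1.24)
    x, u = [x0], []
    for k in range(n):
        d = r + F_U * p[k + 1] * F_U
        u.append(-F_U * (p[k + 1] * (fx[k] * x[k] + b[k]) + chi[k + 1]) / d)  # (1.22)
        x.append(f(x[k], u[k]))   # the control is applied to the nonlinear system
    return x, u


def solve(q: float = 10.0, r: float = 0.01, x0: float = 1.0, steps: int = 99,
          tol: float = 0.01, max_iter: int = 20) -> Tuple[List[float], List[float], List[float]]:
    """Repeat the sweep until the performance index changes by less than tol (relative)."""
    z = [0.0 if i + 1 < 50 else 1.0 for i in range(steps + 1)]  # assumed step at k = 50
    u_nom = [0.0] * steps                                       # assumed initial nominal
    x_nom = [x0]
    for uk in u_nom:
        x_nom.append(f(x_nom[-1], uk))
    history = [cost(x_nom, u_nom, z, q, r)]
    for _ in range(max_iter):
        x_nom, u_nom = sweep(x_nom, u_nom, z, q, r, x0)
        history.append(cost(x_nom, u_nom, z, q, r))
        if abs(history[-1] - history[-2]) <= tol * abs(history[-2]):
            break
    return x_nom, u_nom, history
```

Calling `solve()` returns the final state and control sequences together with the performance index after each iteration, so the convergence behaviour can be inspected directly.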

A sub-optimal controller, with the control determined by

    u(k) = G [z(k) - x(k)]        (1.57)

where G was equal to a constant gain of 15.0, when operated with
the same nonlinear system gave a performance index of 13.845.
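The constant gain controller (1.57) is simple to simulate. The sketch below makes the same assumptions about indexing and about z(k) at k = 50 as any reading of the example must (list index i stands for k = i + 1, and z(50) is taken as 1.0); the function name `constant_gain_cost` is hypothetical, and the result is not claimed to match the quoted figure to all digits.

```python
from typing import List


def constant_gain_cost(gain: float = 15.0, q: float = 10.0, r: float = 0.01,
                       x0: float = 1.0, steps: int = 99) -> float:
    """Simulate system (1.54) under u(k) = G[z(k) - x(k)] and evaluate (1.56)."""
    z: List[float] = [0.0 if i + 1 < 50 else 1.0 for i in range(steps + 1)]  # assumed step at k = 50
    x: List[float] = [x0]
    u: List[float] = []
    for i in range(steps):
        u.append(gain * (z[i] - x[i]))                                # equation (1.57)
        x.append(x[i] - 0.05 * x[i] ** 3 + 0.05 * u[i])               # equation (1.54)
    error = sum(0.5 * q * (zj - xj) ** 2 for zj, xj in zip(z, x))
    effort = sum(0.5 * r * uj ** 2 for uj in u)
    return error + effort
```

Varying `gain` gives a quick feel for how sensitive the performance index is to the choice of a fixed feedback gain.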

As a second example consider the system described by the equations

    x_1(k+1) = x_1(k) + 0.01\,x_2(k); \qquad x_1(1) = 0.0        (1.58)

    x_2(k+1) = x_2(k) - 0.02\,x_1(k) - 0.03\,|x_2(k)|\,x_2(k) + 0.05\,u(k); \qquad x_2(1) = 3.0        (1.59)

    y_1(k) = x_1(k)        (1.60)

    y_2(k) = x_2(k)        (1.61)

Again, we wish to control this system such that the performance index

    J = \sum_{k=1}^{100} \left\{ \tfrac{1}{2} Q_1 [z_1(k) - x_1(k)]^2 + \tfrac{1}{2} Q_2 [z_2(k) - x_2(k)]^2 \right\} + \sum_{k=1}^{99} \tfrac{1}{2} R\,u^2(k)        (1.62)

is a minimum.

The two-dimensional version of equations (1.19), (1.22), (1.23),
and (1.24) form the basis for the iterative solution procedure.
Figure 1.4 shows the results of the computer solution of this problem
for the case when R = 0.01, Q_1 = 1.0, Q_2 = 1.0, z_1(k) = 0 and
z_2(k) = 0. Convergence was achieved in four iterations, and the
performance index on the fourth iteration was 29.29.

The sub-optimal controller with u(k) determined by

    u(k) = G_1 [z_1(k) - x_1(k)] + G_2 [z_2(k) - x_2(k)]        (1.63)

with G_1 = 8.50 and G_2 = 4.75, when operated with the same nonlinear
system gave a performance index of 31.32. Chapter V contains the
results of several additional computer examples.
CHAPTER II
DISCRETE TIME SYSTEMS
2.1  Introduction

The  theory  for  the  control  of  discrete  time  systems  can  be  developed 
more  simply  than  that  of  continuous  time  systems.    In  particular,    the 
discrete  time  theory  avoids  some  questions  about  the  existence  of  limits, 
etc.    For  this   reason,    the  discrete  time  theory  is  presented  first. 

This  chapter  first  considers  linear  discrete  time  systems  thoroughly. 
Then,    using  the  linear  results  as  a  guide,    the  theory  is  extended  to  include 
nonlinear  systems.    The  exact  solution  of  the  nonlinear  problem  is 
presented  in  the  form  of  an  iterative  algorithm.    The  final  section  of  the 
chapter  considers  the  problem  when  stochastic  disturbances  are  present. 

2.2  Linear Systems

The theory for the optimal control of deterministic linear systems
has been worked out by Kalman [4-7], Merriam [15, 16], and others [17-19]. For
this case, the system considered can be described by the equations

    x(k+1) = F(k) x(k) + G(k) u(k); \qquad x(0) = c        (2.1)

    y(k) = H(k) x(k)        (2.2)

where x(k) is the n-dimensional system state vector, u(k) is the
r-dimensional system control vector, and y(k) is the m-dimensional
system output vector.

The performance index is

    J(k) = \sum_{j=k}^{N} \tfrac{1}{2}\|z(j) - y(j)\|^2_{Q(j)} + \sum_{j=k}^{N-1} \tfrac{1}{2}\|u(j)\|^2_{R(j)}        (2.3)

where z(j) is the desired output vector.

To find the optimal control sequence, u(k), u(k+1), ..., u(N-1),
the method of dynamic programming is used. For this purpose, we
define the value function,

    V_{N-k}(x(k)) = \min_{u(k), \ldots, u(N-1)} [\, J(k) \,]        (2.4)

We then invoke the "principle of optimality," which states:
"an optimal policy has the property that, whatever the
initial state and the initial decision are, the remaining
decisions must constitute an optimal policy with regard
to the state resulting from the first decision." [1-3]
Thus, it follows that

    V_{N-k}(x(k)) = \min_{u(k)} \left\{ \tfrac{1}{2}\|z(k) - y(k)\|^2_{Q(k)} + \tfrac{1}{2}\|u(k)\|^2_{R(k)} + V_{N-k-1}(x(k+1)) \right\}        (2.5)

A solution for V_{N-k}(x(k)) and u(k), (k = 0, 1, ..., N-1), can be
obtained by assuming

    V_{N-k}(x(k)) = \tfrac{1}{2}\|x(k)\|^2_{P(k)} + \chi'(k)\,x(k) + a(k)        (2.6)

where P(k), χ(k), and a(k) are a parameter matrix, vector, and
scalar, respectively, to be determined. By combining equations (2.5)
and (2.6), we get

    \tfrac{1}{2}\|x(k)\|^2_{P(k)} + \chi'(k)x(k) + a(k) = \min_{u(k)} \Big\{ \tfrac{1}{2}\|z(k) - y(k)\|^2_{Q(k)} + \tfrac{1}{2}\|u(k)\|^2_{R(k)}
        + \tfrac{1}{2}\|x(k+1)\|^2_{P(k+1)} + \chi'(k+1)\,x(k+1) + a(k+1) \Big\}        (2.7)


The vector variable x(k+1) can be eliminated from this equation by
using equation (2.1). This gives

    \tfrac{1}{2}\|x(k)\|^2_{P(k)} + \chi'(k)x(k) + a(k) = \min_{u(k)} \Big\{ \tfrac{1}{2}\|z(k) - y(k)\|^2_{Q(k)} + \tfrac{1}{2}\|u(k)\|^2_{R(k)} + a(k+1)
        + \tfrac{1}{2}\|F(k)x(k) + G(k)u(k)\|^2_{P(k+1)} + [F(k)x(k) + G(k)u(k)]'\chi(k+1) \Big\}        (2.8)

The minimizing value of u(k) for the expression on the right-hand
side of equation (2.8) can be determined by ordinary methods of calculus.
This value is

    u_{\min}(k) = -[R(k) + G'(k)P(k+1)G(k)]^{-1} G'(k) [P(k+1)F(k)x(k) + \chi(k+1)]        (2.9)
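The calculus step can be written out as follows. This is a short worked derivation added for illustration; it assumes that R(k) is positive definite and that P(k+1) is symmetric and positive semidefinite, so that the stationary point is a minimum.

```latex
% Terms of the right-hand side of (2.8) that depend on u(k):
\phi(u) = \tfrac{1}{2} u' R u + \tfrac{1}{2} (F x + G u)' P(k+1) (F x + G u) + (F x + G u)' \chi(k+1)

% Setting the gradient with respect to u to zero:
\frac{\partial \phi}{\partial u} = R u + G' P(k+1) (F x + G u) + G' \chi(k+1) = 0

% Collecting the terms in u:
[R + G' P(k+1) G] \, u = -G' [P(k+1) F x + \chi(k+1)]

% The second derivative R + G' P(k+1) G is positive definite under the stated
% assumptions, so the stationary point is the minimizing control of (2.9).
```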


By substituting the expression for the minimizing value of u(k) into
equation (2.8), we get

    \tfrac{1}{2}\|x(k)\|^2_{P(k)} + \chi'(k)x(k) + a(k) = \tfrac{1}{2}\|z(k) - H(k)x(k)\|^2_{Q(k)}
        - \tfrac{1}{2}\|P(k+1)F(k)x(k) + \chi(k+1)\|^2_{G(k)[R(k) + G'(k)P(k+1)G(k)]^{-1}G'(k)}
        + \tfrac{1}{2}\|F(k)x(k)\|^2_{P(k+1)} + x'(k)F'(k)\chi(k+1) + a(k+1)        (2.10)

This equation will be satisfied for all x(k) if and only if the following
recursion equations are satisfied.

    P(k) = H'(k)Q(k)H(k) + F'(k)M(k)P(k+1)F(k)        (2.11)

    \chi(k) = F'(k)M(k)\chi(k+1) - H'(k)Q(k)z(k)        (2.12)

    a(k) = a(k+1) + \tfrac{1}{2}\|z(k)\|^2_{Q(k)} - \tfrac{1}{2}\|\chi(k+1)\|^2_{G(k)[R(k) + G'(k)P(k+1)G(k)]^{-1}G'(k)}        (2.13)

where

    M(k) = I - P(k+1)G(k)[R(k) + G'(k)P(k+1)G(k)]^{-1}G'(k)        (2.14)

The boundary conditions for this set of equations can be determined from
equations (2.3), (2.4), and (2.6) evaluated at k = N. Thus

    P(N+1) = 0        (2.15)

    \chi(N+1) = 0        (2.16)

    a(N+1) = 0        (2.17)

form the appropriate boundary conditions.

Notice that equations (2.11), (2.12), and (2.13) must be solved
backwards in time. For this reason, the system must be "deterministic"
in the sense that z(j) must be known on the entire interval,
j = k, k+1, ..., N, in order to compute the optimal control vector at
time j = k. Also notice that a(k) is required to determine V_{N-k}(x(k)),
but is not required to determine u(k). Thus, if we want to synthesize
the optimal control system and are not interested in computing the
minimum value of the performance index, we need not compute the a(k)
sequence. A block diagram of the optimal linear control system is shown
in figure 2.1.

As  can  be  seen  from  the  block  diagram,    the  controller  consists  of  a 
time  varying  linear  feedback  portion  and  a  feed-forward  or  director 
portion.    The  feedback  signal  is  simply  the  system  state  vector  amplified 
by the time varying gain matrix P(k+1)F(k). The feed-forward signal,
χ(k), may be interpreted as a modified desired output. In other words,
the closed loop portion of the system tries to follow χ(k) instead of
z(k) because it is more economical.

From equation (2.12), it can be seen that χ(k) is derived from z(k)
by the feedback system shown in figure 2.2. As has been stated previously,
this system operates backward in time.

[Figure 2.2: System for χ(k). Blocks: H'(k)Q(k), unit advance, F'(k)M(k); signals: z(k), χ(k), χ(k+1)]

If the output of the system shown above follows the input reasonably
well, using -H'(k)Q(k)z(k) in place of χ(k) for the feed-forward input
to the control system of figure 2.1 should give nearly optimal performance.
This would eliminate the objectionable requirement of having to know z(j)
over the entire interval in advance.

The computational procedure for determining the optimal control is
apparent from the nature of the equations. The matrices P(N), P(N-1), ...,
P(0), and the vectors χ(N), χ(N-1), ..., χ(0) must be pre-computed
by backwards recursion of equations (2.11) and (2.12). These quantities
would then be used along with x(k) to determine u(k) as the actual
control system evolves forward in time.
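The two-pass procedure just described can be sketched for a scalar, time-invariant system. This is an illustration under assumed simplifications (constant scalars F, G, H, q, r; the names `lq_tracking` and `tracking_cost` are hypothetical), not the author's program. It also carries the a(k) recursion (2.13) so that the value function (2.6) evaluated at k = 0 can be compared with the cost actually incurred.

```python
from typing import List, Tuple


def tracking_cost(F: float, G: float, H: float, q: float, r: float,
                  z: List[float], x0: float, u: List[float]) -> float:
    """Performance index (2.3) from k = 0 for a given control sequence."""
    x = [x0]
    for uk in u:
        x.append(F * x[-1] + G * uk)                       # equation (2.1)
    error = sum(0.5 * q * (zj - H * xj) ** 2 for zj, xj in zip(z, x))
    return error + sum(0.5 * r * uk ** 2 for uk in u)


def lq_tracking(F: float, G: float, H: float, q: float, r: float,
                z: List[float], x0: float) -> Tuple[List[float], float, float]:
    """Backward recursion of (2.11) to (2.14), then forward control by (2.9)."""
    n = len(z) - 1                     # z(0..N); controls u(0..N-1)
    p = [0.0] * (n + 2)                # p[N+1] = 0,   equation (2.15)
    chi = [0.0] * (n + 2)              # chi[N+1] = 0, equation (2.16)
    a = [0.0] * (n + 2)                # a[N+1] = 0,   equation (2.17)
    for k in range(n, -1, -1):
        d = r + G * p[k + 1] * G
        m = 1.0 - p[k + 1] * G * G / d                                       # (2.14)
        p[k] = H * q * H + F * m * p[k + 1] * F                              # (2.11)
        chi[k] = F * m * chi[k + 1] - H * q * z[k]                           # (2.12)
        a[k] = a[k + 1] + 0.5 * q * z[k] ** 2 - 0.5 * chi[k + 1] ** 2 * G * G / d  # (2.13)
    x, u = x0, []
    for k in range(n):
        d = r + G * p[k + 1] * G
        uk = -G * (p[k + 1] * F * x + chi[k + 1]) / d                        # (2.9)
        u.append(uk)
        x = F * x + G * uk
    predicted = 0.5 * p[0] * x0 ** 2 + chi[0] * x0 + a[0]                    # (2.6) at k = 0
    return u, tracking_cost(F, G, H, q, r, z, x0, u), predicted
```

For a linear system the predicted value and the simulated cost should agree to rounding error, which gives a convenient check on an implementation of the recursion.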

Another consideration concerning the optimal control system is that
of the measurement of state variables. For the preceding development,
we have tacitly assumed that the state variables are exactly measurable.
This frequently is not a reasonable assumption. For the linear problem,
Gunckel [20, 21] has shown that the optimal control system for the case when
the state variables are not exactly measurable consists of the control
system derived above with an optimum filter inserted in the control loop
to estimate the state variables. When the state variables are not exactly
measurable in the case of nonlinear control systems, we have no
assurance that an optimal filter to estimate the state variables inserted
in the control system will result in optimal performance. In this case,
however, as Cox [22] has pointed out, if the state variables are not exactly
measurable, we have no alternative to determining the optimal control
system by assuming the state variables are exactly measurable and then
inserting an optimal filter in the control loop. In all that follows, we will
assume that the state variables are exactly measurable. Cox [22] has
treated the problem of estimating state variables in noisy nonlinear
systems.

2.3  Nonlinear Systems

The theory for the optimal control of deterministic linear systems
is extended to a fairly general class of nonlinear systems in this section.
Actually the solution derived in this section is only approximately
optimal. Section 2.4 presents an iterative procedure based on this
approximate solution that leads to the exact solution.

For the nonlinear case, the system considered can be described by
the state equations

    x(k+1) = f(x(k), u(k), k); \qquad x(0) = c        (2.18)

    y(k) = h(x(k), k)        (2.19)
The performance index is

    J(k) = \sum_{j=k}^{N} \tfrac{1}{2}\|z(j) - y(j)\|^2_{Q(j)} + \sum_{j=k}^{N-1} \tfrac{1}{2}\|u(j)\|^2_{R(j)}        (2.20)

We follow the procedure of the previous section and define

    V_{N-k}(x(k)) = \min_{u(k), \ldots, u(N-1)} [\, J(k) \,]        (2.21)

By the principle of optimality, it follows that

    V_{N-k}(x(k)) = \min_{u(k)} \left\{ \tfrac{1}{2}\|z(k) - y(k)\|^2_{Q(k)} + \tfrac{1}{2}\|u(k)\|^2_{R(k)} + V_{N-k-1}(x(k+1)) \right\}        (2.22)

We cannot solve this equation by direct methods; so we resort to
linearization.

The approximations are

    x(k+1) = f(x^*(k), u^*(k), k) + f_x(x^*(k), u^*(k), k)[x(k) - x^*(k)] + f_u(x^*(k), u^*(k), k)[u(k) - u^*(k)]        (2.23)

and

    y(k) = h(x^*(k), k) + h_x(x^*(k), k)[x(k) - x^*(k)]        (2.24)

As before, we assume

    V_{N-k}(x(k)) = \tfrac{1}{2}\|x(k)\|^2_{P(k)} + \chi'(k)\,x(k) + a(k)        (2.25)

By combining equations (2.22), (2.23), (2.24), and (2.25) we obtain the
single equation

    \tfrac{1}{2}\|x(k)\|^2_{P(k)} + \chi'(k)x(k) + a(k) = \min_{u(k)} \Big\{ \tfrac{1}{2}\|z(k) - y(k)\|^2_{Q(k)} + \tfrac{1}{2}\|u(k)\|^2_{R(k)}
        + \tfrac{1}{2}\|f + f_x[x(k) - x^*(k)] + f_u[u(k) - u^*(k)]\|^2_{P(k+1)}
        + [f + f_x(x(k) - x^*(k)) + f_u(u(k) - u^*(k))]'\chi(k+1) + a(k+1) \Big\}        (2.26)

(When the arguments of f, f_x, and f_u are omitted, they are understood
to be evaluated at x*(k), u*(k), and k. Similarly, when the arguments
of h and h_x are omitted, they are understood to be evaluated at x*(k)
and k.)


The minimizing value of u(k) can be computed by the ordinary
methods of calculus, and is given by

    u_{\min}(k) = -[R(k) + f_u' P(k+1) f_u]^{-1} f_u' [P(k+1) f_x x(k) + P(k+1) b(k) + \chi(k+1)]        (2.27)

where

    b(k) = f - f_x x^*(k) - f_u u^*(k)        (2.28)

When the minimizing value of u(k) from equation (2.27) is substituted into
equation (2.26), it becomes

    \tfrac{1}{2}\|x(k)\|^2_{P(k)} + \chi'(k)x(k) + a(k) = \tfrac{1}{2}\|z(k) - h_x x(k) - c(k)\|^2_{Q(k)}
        - \tfrac{1}{2}\|P(k+1) f_x x(k) + P(k+1) b(k) + \chi(k+1)\|^2_{f_u [R(k) + f_u' P(k+1) f_u]^{-1} f_u'}
        + \tfrac{1}{2}\|f_x x(k) + b(k)\|^2_{P(k+1)} + [f_x x(k) + b(k)]'\chi(k+1) + a(k+1)        (2.29)

where

    c(k) = h - h_x x^*(k)        (2.30)

This equation will be satisfied for all x(k) if and only if the following
set of equations is satisfied.

    P(k) = h_x' Q(k) h_x + f_x' M(k) P(k+1) f_x        (2.31)

    \chi(k) = f_x' M(k) [P(k+1) b(k) + \chi(k+1)] - h_x' Q(k) [z(k) - c(k)]        (2.32)

    a(k) = a(k+1) + \tfrac{1}{2}\|z(k) - c(k)\|^2_{Q(k)} + \tfrac{1}{2}\|b(k)\|^2_{P(k+1)} + b'(k)\chi(k+1)
           - \tfrac{1}{2}\|P(k+1) b(k) + \chi(k+1)\|^2_{f_u [R(k) + f_u' P(k+1) f_u]^{-1} f_u'}        (2.33)

where

    M(k) = I - P(k+1) f_u [R(k) + f_u' P(k+1) f_u]^{-1} f_u'        (2.34)

The boundary values for this set of equations can be obtained in the
same manner as for the linear problem. The boundary values are

    P(N+1) = 0        (2.35)

    \chi(N+1) = 0        (2.36)

    a(N+1) = 0        (2.37)
Once the sequences of points, x*(k) and u*(k), are given, the
sequences P(k), χ(k), and a(k) may be computed by backward
recursion of equations (2.31), (2.32), and (2.33). After these quantities
have been pre-computed, the system may be operated forward in time
under the approximately optimal control given by equation (2.27). The
problem is, of course, to determine a sequence, x*(k) and u*(k),
about which to linearize such that the approximation is a good one. This
is the subject of the next section.

Figure  2.  3    shows  a  block  diagram  of  the  nonlinear  control  system. 
Notice  that  although  the  system  being  controlled  is  nonlinear,    the 
controller  is  time  varying  linear. 

2.4       Solution  by  Iteration 

The development of the theory in this section requires us to attack
the optimal nonlinear control problem from a different point of view.
Consider again the system

    x(k+1) = f(x(k), u(k), k); \qquad x(0) = c        (2.38)

    y(k) = h(x(k), k)        (2.39)

subject to the performance criterion

    J = \sum_{k=0}^{N} \tfrac{1}{2}\|z(k) - y(k)\|^2_{Q(k)} + \sum_{k=0}^{N-1} \tfrac{1}{2}\|u(k)\|^2_{R(k)}        (2.40)

We wish to choose u(k) such that the performance criterion is a minimum.
The minimization can be performed by calculus techniques using
Lagrange multipliers.¹ For this purpose we define the function

    I = \sum_{k=0}^{N} \tfrac{1}{2}\|z(k) - h(x(k), k)\|^2_{Q(k)} + \sum_{k=0}^{N-1} \tfrac{1}{2}\|u(k)\|^2_{R(k)}
        + \sum_{k=0}^{N-1} \lambda'(k) [x(k+1) - f(x(k), u(k), k)] + \lambda'(-1) [x(0) - c]        (2.41)


¹ This approach is similar to that used by Kipiniak [28] and the entire
development of this section, including the iterative procedure, is closely
related to the nonlinear smoothing problem treated by Cox.
By equating partial derivatives of I with respect to u(k), λ(k), and
x(k) to zero, we obtain the following set of equations which define the
optimum control system.

    u(k) = R^{-1}(k) f_u'(x(k), u(k), k) \lambda(k)        (2.42)

    x(k+1) = f(x(k), u(k), k)        (2.43)

    \lambda(k-1) = f_x'(x(k), u(k), k) \lambda(k) + h_x' Q(k) [z(k) - h(x(k), k)]        (2.44)

The boundary conditions are

    x(0) = c        (2.45)

and

    \lambda(N) = 0        (2.46)

This set of equations is nonlinear, and an analytic solution is not
known. However, we can obtain an approximate solution by using the
linearizations

$$ x(k+1) \approx f(x^*(k), u^*(k), k) + f_x(x^*(k), u^*(k), k)\,[x(k) - x^*(k)] + f_u(x^*(k), u^*(k), k)\,[u(k) - u^*(k)] \tag{2.47} $$

and

$$ y(k) \approx h(x^*(k), k) + h_x(x^*(k), k)\,[x(k) - x^*(k)] \tag{2.48} $$

When we use these approximations instead of equations (2.38) and (2.39),
the equations for u(k), x(k+1), and λ(k-1) become

$$ u(k) = R^{-1}(k)\, f_u'\, \lambda(k) \tag{2.49} $$

$$ x(k+1) = f + f_x\,[x(k) - x^*(k)] + f_u\,[u(k) - u^*(k)] \tag{2.50} $$

$$ \lambda(k-1) = f_x'\, \lambda(k) + h_x'\, Q(k)\,\{ z(k) - h - h_x\,[x(k) - x^*(k)] \} \tag{2.51} $$

where f, f_x, and f_u are understood to be evaluated at x*(k), u*(k),
and k, and h and h_x are understood to be evaluated at x*(k) and k.
We can solve the above set of equations by assuming

$$ \lambda(k-1) = -P(k)\,x(k) - \chi(k) \tag{2.52} $$
The solution proceeds by combining equations (2.49), (2.50), (2.51), and
(2.52) to eliminate u(k), λ(k), and λ(k-1), obtaining

$$ x(k+1) = f + f_x\,[x(k) - x^*(k)] - f_u u^*(k) - f_u R^{-1}(k) f_u'\,[P(k+1)x(k+1) + \chi(k+1)] \tag{2.53} $$

and

$$ P(k)x(k) + \chi(k) = f_x'\,[P(k+1)x(k+1) + \chi(k+1)] - h_x' Q(k)\,\{ z(k) - h - h_x\,[x(k) - x^*(k)] \} \tag{2.54} $$

These two equations can be combined to eliminate x(k+1), giving

$$ P(k)x(k) + \chi(k) = f_x'\chi(k+1) - h_x' Q(k)\,\{ z(k) - h - h_x\,[x(k) - x^*(k)] \} + f_x' P(k+1)\,[I + f_u R^{-1}(k) f_u' P(k+1)]^{-1} \{ f + f_x\,[x(k) - x^*(k)] - f_u u^*(k) - f_u R^{-1}(k) f_u' \chi(k+1) \} \tag{2.55} $$

provided the inverse indicated exists. (Section 2.5 contains a proof that
the inverse required above does indeed exist.)

Equation (2.55) will be satisfied for all x(k) if and only if the following
set of equations is satisfied.

$$ P(k) = f_x' P(k+1)\,[I + f_u R^{-1}(k) f_u' P(k+1)]^{-1} f_x + h_x' Q(k) h_x \tag{2.56} $$

$$ \chi(k) = f_x' \chi(k+1) - h_x' Q(k)\,[z(k) - c(k)] - f_x' P(k+1)\,[I + f_u R^{-1}(k) f_u' P(k+1)]^{-1}\,[f_u R^{-1}(k) f_u' \chi(k+1) - b(k)] \tag{2.57} $$

We are now in a position to obtain an exact solution to equations (2.42),
(2.43), and (2.44), and hence an exact solution to the nonlinear control
problem. The exact solution is obtained by solving equations (2.49), (2.50),
(2.52), (2.56), and (2.57) iteratively.

First, we denote the state sequence and the control sequence obtained
on the ith iteration as x_i(0), ..., x_i(N) and u_i(0), ..., u_i(N-1),
respectively. Then, for the (i+1)st iteration we linearize about the points
x_i(k) and u_i(k). The procedure for the (i+1)st iteration is as follows:

Step 1.  Solve equations (2.56) and (2.57) backward in time using

$$ x^*(k) = x_i(k) \tag{2.58} $$

and

$$ u^*(k) = u_i(k) \tag{2.59} $$

to compute P(N), ..., P(0) and χ(N), ..., χ(0).

Step 2.  Solve equations (2.49), (2.50), and (2.52) forward in
time using

$$ x^*(k) = x_i(k) \tag{2.60} $$

and

$$ u^*(k) = u_i(k) \tag{2.61} $$

as before, to compute u_{i+1}(0), ..., u_{i+1}(N-1) and
x_{i+1}(0), ..., x_{i+1}(N).
Steps 1 and 2 are repeated until convergence is achieved, i.e., until the
norms of the quantities [x_{i+1}(k) - x_i(k)] and [u_{i+1}(k) - u_i(k)] are less
than some previously specified convergence criteria. It can be seen by
comparing equations (2.49), (2.50), and (2.51) with equations (2.42),
(2.43), and (2.44) that if convergence is achieved using the iterative
procedure, that is, if

$$ x_{i+1}(k) = x_i(k) \tag{2.62} $$

and

$$ u_{i+1}(k) = u_i(k) \tag{2.63} $$

then the solution obtained is the exact solution for equations (2.42),
(2.43), and (2.44) as well, since the linearized equations coincide with
the original ones when x(k) = x*(k) and u(k) = u*(k). In other words,
the solution obtained by convergence of the iterative procedure is the
exact solution to the optimal nonlinear control problem. The question of
under what conditions convergence can be assured is a difficult one, and
as yet has not been answered by the author. This remains a challenging
area for possible future research. However, computer studies using this
iteration procedure indicate that convergence usually occurs in a few
iterations. Chapter V contains some of these results.
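The two-step procedure can be sketched for a scalar problem as follows. This is only an illustrative sketch, not the program used for the computer studies: the plant x(k+1) = x + dt(-x^3 + u), the output h(x) = x, the desired output z = 0, and all numerical values are assumptions chosen for the example. Step 1 is the scalar form of equations (2.56) and (2.57); Step 2 uses the control obtained by solving (2.49), (2.50), and (2.52) for u(k).

```python
def f(x, u, dt=0.1):
    """Hypothetical plant (an assumption): x(k+1) = x + dt*(-x**3 + u)."""
    return x + dt * (-x ** 3 + u)

def f_x(x, u, dt=0.1):
    return 1.0 - 3.0 * dt * x ** 2

def f_u(x, u, dt=0.1):
    return dt

def simulate(c, us):
    """Run the true nonlinear plant forward from x(0) = c."""
    xs = [c]
    for u in us:
        xs.append(f(xs[-1], u))
    return xs

def cost(xs, us, z=0.0, Q=1.0, R=0.1):
    """Quadratic criterion of the form (2.40) with h(x) = x."""
    return (sum(0.5 * Q * (z - x) ** 2 for x in xs)
            + sum(0.5 * R * u ** 2 for u in us))

def linearize(x_star, u_star):
    """Return f_x, f_u and b = f - f_x x* - f_u u* at the nominal point."""
    fx, fu = f_x(x_star, u_star), f_u(x_star, u_star)
    return fx, fu, f(x_star, u_star) - fx * x_star - fu * u_star

def solve(c=1.0, N=20, Q=1.0, R=0.1, z=0.0, tol=1e-8, max_iter=50):
    us = [0.0] * N                 # initial guess: zero control
    xs = simulate(c, us)
    for iteration in range(1, max_iter + 1):
        # Step 1: backward sweep. h(x) = x gives h_x = 1 and c(k) = 0.
        P = [0.0] * (N + 2)        # P(N+1) = 0
        chi = [0.0] * (N + 2)      # chi(N+1) = 0
        for k in range(N, -1, -1):
            # At k = N the dynamics terms are multiplied by P(N+1) = 0.
            fx, fu, b = (0.0, 0.0, 0.0) if k == N else linearize(xs[k], us[k])
            D = 1.0 + fu * fu * P[k + 1] / R
            P[k] = fx * P[k + 1] * fx / D + Q
            chi[k] = (fx * chi[k + 1] - Q * z
                      - fx * P[k + 1] / D * (fu * fu / R * chi[k + 1] - b))
        # Step 2: forward sweep with the linearized model.
        new_xs, new_us = [c], []
        for k in range(N):
            fx, fu, b = linearize(xs[k], us[k])
            x = new_xs[-1]
            u = (-fu * (P[k + 1] * (fx * x + b) + chi[k + 1])
                 / (R + fu * P[k + 1] * fu))
            new_us.append(u)
            new_xs.append(fx * x + fu * u + b)
        change = max(max(abs(p - q) for p, q in zip(new_xs, xs)),
                     max(abs(p - q) for p, q in zip(new_us, us)))
        xs, us = new_xs, new_us
        if change < tol:
            return xs, us, iteration
    raise RuntimeError("iteration did not converge")
```

Calling `solve()` returns the converged state and control sequences together with the number of iterations used. At convergence the linearized trajectory agrees with the trajectory of the nonlinear plant driven by the same control, which is the property the argument above relies on.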

Because the inverse in equations (2.56) and (2.57) is generally more
difficult to compute than the inverse occurring in the solution of the last
section, we would prefer to use equations (2.31) and (2.32) as the basis
for the iterative algorithm in lieu of equations (2.56) and (2.57). However,
nothing we have shown thus far would permit us to do this and still
guarantee that a convergent solution for the iterative algorithm is the
exact solution to the nonlinear control problem.




We can show that the iteration scheme based on equations (2.31) and
(2.32) does lead to the exact solution, and in fact is identical to the scheme
based on equations (2.56) and (2.57), by using the following matrix identities:

$$ [I + f_u R^{-1}(k) f_u' P(k+1)]^{-1} \equiv I - f_u\,[R(k) + f_u' P(k+1) f_u]^{-1} f_u' P(k+1) \tag{2.64} $$

and

$$ [I + f_u R^{-1}(k) f_u' P(k+1)]^{-1} f_u R^{-1}(k) f_u' \equiv f_u\,[R(k) + f_u' P(k+1) f_u]^{-1} f_u' \tag{2.65} $$

(Appendix A contains a proof of these identities.) The application of
identities (2.64) and (2.65) to equations (2.56) and (2.57) immediately
transforms them into equations (2.31) and (2.32). In addition, by
equation (2.49),

$$ u(k) = R^{-1}(k)\, f_u'\, \lambda(k) \tag{2.66} $$

or, using (2.52),

$$ u(k) = -R^{-1}(k)\, f_u'\,[P(k+1)x(k+1) + \chi(k+1)] \tag{2.67} $$

and by (2.50)

$$ u(k) = -R^{-1}(k)\, f_u'\,\{ P(k+1)\,[f + f_x(x(k) - x^*(k)) + f_u(u(k) - u^*(k))] + \chi(k+1) \} \tag{2.68} $$

Solving this equation for u(k) explicitly yields

$$ u(k) = -[R(k) + f_u' P(k+1) f_u]^{-1} f_u'\,\{ P(k+1)\,[f + f_x(x(k) - x^*(k)) - f_u u^*(k)] + \chi(k+1) \} \tag{2.69} $$

which is identical to equation (2.27). Thus we have shown that the solution
based on the equations derived in this section is identical to the
solution based on the equations of the previous section.
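As a numerical sanity check (not a substitute for the proof in Appendix A), identities (2.64) and (2.65) can be verified on randomly generated matrices. The sketch below uses only the standard library; the helper functions, the dimensions n = 3 and r = 2, and the random test matrices are assumptions made for the example. P is built as S'S so that it is non-negative definite, and R as T'T + I so that it is positive definite.

```python
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(row) for row in zip(*A)]

def add(A, B, sign=1.0):
    return [[a + sign * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def inverse(A):
    """Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [list(map(float, row)) + e for row, e in zip(A, identity(n))]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def random_matrix(rows, cols):
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

random.seed(0)
n, r = 3, 2
fu = random_matrix(n, r)                      # plays the role of f_u
S = random_matrix(n, n)
P = matmul(transpose(S), S)                   # non-negative definite
T = random_matrix(r, r)
R = add(matmul(transpose(T), T), identity(r)) # positive definite
Rinv = inverse(R)
fuT = transpose(fu)

# Left and right sides of (2.64)
lhs64 = inverse(add(identity(n), matmul(matmul(matmul(fu, Rinv), fuT), P)))
K = inverse(add(R, matmul(matmul(fuT, P), fu)))
rhs64 = add(identity(n), matmul(matmul(matmul(fu, K), fuT), P), sign=-1.0)

# Left and right sides of (2.65)
lhs65 = matmul(matmul(matmul(lhs64, fu), Rinv), fuT)
rhs65 = matmul(matmul(fu, K), fuT)

err64 = max_abs_diff(lhs64, rhs64)
err65 = max_abs_diff(lhs65, rhs65)
```

Both `err64` and `err65` should be at the level of floating-point round-off. Note that the right-hand sides require only an r x r inverse, which is the computational advantage mentioned above when r is smaller than n.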

2.5      On P(k) and [I + f_u R^{-1}(k) f_u' P(k+1)]^{-1}


This section contains two theorems of importance to the material in
this chapter. The first theorem concerns the existence of
[I + f_u R^{-1}(k) f_u' P(k+1)]^{-1}, and the second theorem concerns the
non-negative definiteness of P(k). The proof of these theorems will require
some elementary results from matrix theory. These are:

a.  If the n x n matrix P is non-negative definite, then the matrix
G'PG is non-negative definite, where G is any n x r matrix.

b.  The inverse of a positive definite matrix exists and is positive
definite.

c.  The sum of a positive definite matrix and a non-negative definite
matrix is positive definite.

d.  The sum of two non-negative definite matrices is non-negative
definite.

Theorem 1:  If R(k) is positive definite, and P(k+1) is non-negative
definite, then the inverse [I + f_u R^{-1}(k) f_u' P(k+1)]^{-1} exists.

Proof:  Consider the matrix expression

$$ I - f_u\,[R(k) + f_u' P(k+1) f_u]^{-1} f_u' P(k+1) \tag{2.70} $$

If P(k+1) is non-negative definite, then by a., f_u'P(k+1)f_u is
non-negative definite. If R(k) is positive definite, then by c.,
R(k) + f_u'P(k+1)f_u is positive definite, and hence by b.,
[R(k) + f_u'P(k+1)f_u]^{-1} exists. Thus the whole expression exists. But,
by the first identity of section 2.4, [I + f_u R^{-1}(k) f_u' P(k+1)]^{-1}
is identical to the expression above and hence must exist.

Theorem 2:  If R(k) is positive definite, and if Q(k) and P(k+1) are
non-negative definite, then P(k) is non-negative definite.

Proof:  Consider equation (2.56), rewritten here.

$$ P(k) = f_x' P(k+1)\,[I + f_u R^{-1}(k) f_u' P(k+1)]^{-1} f_x + h_x' Q(k) h_x \tag{2.71} $$

If Q(k) is non-negative definite, then by a., h_x'Q(k)h_x is non-negative
definite. As for the first term on the right of (2.56), it must be
non-negative definite also if P(k+1) is non-negative definite. To show that
this is so, let

$$ [I + f_u R^{-1}(k) f_u' P(k+1)]^{-1} f_x = A \tag{2.72} $$

then

$$ f_x = [I + f_u R^{-1}(k) f_u' P(k+1)]\, A \tag{2.73} $$

Thus we see that

$$ f_x' P(k+1)\,[I + f_u R^{-1}(k) f_u' P(k+1)]^{-1} f_x = A'\,[I + f_u R^{-1}(k) f_u' P(k+1)]'\, P(k+1)\, A \tag{2.74} $$

or

$$ f_x' P(k+1)\,[I + f_u R^{-1}(k) f_u' P(k+1)]^{-1} f_x = A' P(k+1) A + A' P(k+1) f_u R^{-1}(k) f_u' P(k+1) A \tag{2.75} $$

But by a. and d., the right-hand side of equation (2.75) is non-negative
definite. Hence, for the same reason, the right-hand side of equation (2.56)
is non-negative definite, completing the proof.

The hypotheses of theorem 2 are satisfied by the original assumptions
of the problem statement. The hypotheses of theorem 1 are satisfied by
the original assumptions in the problem statement, and by the results of
theorem 2. Thus theorem 1 applies to equation (2.55) in section 2.4.

2.6      An Alternative Linearization Procedure

There are other possible linearization procedures that can be applied
to the nonlinear control problem. One procedure suggested by Pearson²⁴
has the advantage of being computationally simpler than the methods of
sections 2.3 and 2.4, but it is theoretically less attractive.

To present the theory for this method, we follow the approach used in
section 2.3. However, instead of the linearization used there, we use the
following linearizations.

$$ x(k+1) \approx F(x^*(k), u^*(k), k)\, x(k) + G(x^*(k), u^*(k), k)\, u(k) \tag{2.76} $$

$$ y(k) \approx H(x^*(k), k)\, x(k) \tag{2.77} $$

where F, G, and H are determined such that

$$ f(x(k), u(k), k) = F(x(k), u(k), k)\, x(k) + G(x(k), u(k), k)\, u(k) \tag{2.78} $$

$$ h(x(k), k) = H(x(k), k)\, x(k) \tag{2.79} $$

This type of linearization is not unique, and it is an open question as to
which linearization of this type is best. However, in many instances
there is an obvious, intuitively appealing way to proceed.
As an example of such a linearization, consider the scalar nonlinear
function

$$ f(x(k), u(k), k) = -x^3(k) + \sqrt[3]{u(k)} \tag{2.80} $$

One possible linearization is

$$ f(x(k), u(k), k) \approx -x^{*2}(k)\, x(k) + u^{*-2/3}(k)\, u(k) \tag{2.81} $$

Another one, arbitrarily chosen, is

$$ f(x(k), u(k), k) \approx [-x^{*2}(k) - u^*(k)]\, x(k) + [u^{*-2/3}(k) + x^*(k)]\, u(k) \tag{2.82} $$

The first, of course, is intuitively more appealing.
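The non-uniqueness can be checked numerically. The short sketch below, with function names and sample values chosen only for illustration (and u restricted to positive values so that the fractional powers are real), confirms that both factorizations reproduce f exactly at the point of linearization and differ away from it.

```python
def f(x, u):
    """Scalar nonlinearity of equation (2.80), for u > 0."""
    return -x ** 3 + u ** (1.0 / 3.0)

def coefficients_first(x_star, u_star):
    """F and G of linearization (2.81)."""
    return -x_star ** 2, u_star ** (-2.0 / 3.0)

def coefficients_second(x_star, u_star):
    """F and G of linearization (2.82)."""
    return -x_star ** 2 - u_star, u_star ** (-2.0 / 3.0) + x_star

def approximate(coefficients, x_star, u_star, x, u):
    """Evaluate F(x*, u*) x + G(x*, u*) u."""
    F, G = coefficients(x_star, u_star)
    return F * x + G * u

x_star, u_star = 0.7, 2.0      # sample nominal point (an assumption)

# At the nominal point both forms satisfy (2.78) exactly.
exact = f(x_star, u_star)
at_nominal_first = approximate(coefficients_first, x_star, u_star, x_star, u_star)
at_nominal_second = approximate(coefficients_second, x_star, u_star, x_star, u_star)

# Away from the nominal point the two approximations disagree.
away_first = approximate(coefficients_first, x_star, u_star, 0.8, 2.5)
away_second = approximate(coefficients_second, x_star, u_star, 0.8, 2.5)
```

The extra terms in (2.82), -u*(k) x(k) and x*(k) u(k), cancel at the nominal point, which is why the requirement (2.78) alone cannot single out one linearization.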

By using the linearizations outlined above instead of equations (2.23)
and (2.24), equation (2.26) becomes

$$ \tfrac{1}{2}\|x(k)\|^2_{P(k)} + \chi'(k)x(k) + \alpha(k) = \min_{u(k)} \Big\{ \tfrac{1}{2}\|z(k) - Hx(k)\|^2_{Q(k)} + \tfrac{1}{2}\|u(k)\|^2_{R(k)} + \tfrac{1}{2}\|Fx(k) + Gu(k)\|^2_{P(k+1)} + [Fx(k) + Gu(k)]'\chi(k+1) + \alpha(k+1) \Big\} \tag{2.83} $$

(When the arguments of F, G, and H are omitted, they are understood
to be evaluated at the points x*(k), u*(k), and k.)
The minimizing value of u(k) is

$$ u(k) = -[R(k) + G'P(k+1)G]^{-1} G'\,[P(k+1)Fx(k) + \chi(k+1)] \tag{2.84} $$

When this value of u(k) is substituted into equation (2.83), we get

$$ \tfrac{1}{2}\|x(k)\|^2_{P(k)} + \chi'(k)x(k) + \alpha(k) = \tfrac{1}{2}\|z(k) - Hx(k)\|^2_{Q(k)} - \tfrac{1}{2}\|P(k+1)Fx(k) + \chi(k+1)\|^2_{G[R(k) + G'P(k+1)G]^{-1}G'} + \tfrac{1}{2}\|Fx(k)\|^2_{P(k+1)} + x'(k)F'\chi(k+1) + \alpha(k+1) \tag{2.85} $$

This equation will be satisfied for all x(k) if and only if the following
equations are satisfied.

$$ P(k) = H'Q(k)H + F'M(k)P(k+1)F \tag{2.86} $$

$$ \chi(k) = F'M(k)\chi(k+1) - H'Q(k)z(k) \tag{2.87} $$

$$ \alpha(k) = \alpha(k+1) + \tfrac{1}{2}\|z(k)\|^2_{Q(k)} - \tfrac{1}{2}\|\chi(k+1)\|^2_{G[R(k) + G'P(k+1)G]^{-1}G'} \tag{2.88} $$

where

$$ M(k) = I - P(k+1)G\,[R(k) + G'P(k+1)G]^{-1} G' \tag{2.89} $$

The boundary conditions are again

$$ P(N+1) = 0 \tag{2.90} $$

$$ \chi(N+1) = 0 \tag{2.91} $$

$$ \alpha(N+1) = 0 \tag{2.92} $$

As can be seen, these equations are identical in form to the solution
equations for the linear system. The only difference is that the matrices
F, G, and H in this section are functions of x*(k) and u*(k) as well
as of k.

An iterative type of solution, similar to that introduced in section 2.4,
is possible here also. However, we cannot show that this iterative solution
converges to the exact optimal nonlinear solution. The reason for this
can be seen by comparing equation (2.44) of the exact optimal nonlinear
solution, rewritten here,

$$ \lambda(k-1) = f_x'(x(k), u(k), k)\, \lambda(k) + h_x' Q(k)\,[z(k) - h(x(k), k)] \tag{2.44} $$

with the equation corresponding to equation (2.51) when the approximations
of this section are used. This equation would be

$$ \lambda(k-1) = F'\lambda(k) + H'Q(k)\,[z(k) - Hx(k)] \tag{2.93} $$

It is obvious that equation (2.93) will not approach equation (2.44) as
x(k) approaches x*(k), since F and H are in general not the derivatives
f_x and h_x. Thus the convergent solution of the iteration procedure based
on the equations of this section will not in general be the exact optimal
solution. We could only hope that this solution would be very near the
true optimum.

2.7      Nonlinear Systems with Stochastic Disturbances

This section presents a technique for controlling a nonlinear system
that is subject to stochastic disturbances. Such a system can be described
by the equations

$$ x(k+1) = f(x(k), u(k), k) + r(k); \qquad x(0) = c \tag{2.94} $$

$$ y(k) = h(x(k), k) \tag{2.95} $$
where r(k) is an n-dimensional random vector such that r(j) is
independent of r(k) for j ≠ k. Thus r(k) is essentially the discrete
time equivalent of white noise.

If the nonlinear system we are interested in controlling is disturbed
not by an independent random input as described above, but instead
by a random vector that can be described by the difference
equation

$$ r(k+1) = g(r(k), k) + w(k); \qquad r(0) = w \tag{2.96} $$

where    w  (k)     is  an  independent  random  sequence,    then  the  system 
equations  can  be  transformed  into  the  form  of  (2.  94)  and  (2.  95)  by 
augmenting  the  state  variables.    This  can  best  be  illustrated  by  a  simple 
example. 

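Suppose the system is described by the equations

$$ x(k+1) = x(k)\,u(k) + r(k) \tag{2.97} $$

and

$$ r(k+1) = a\,r(k) + w(k) \tag{2.98} $$

where w(k) is an independent random variable. We can define an
augmented state vector

$$ \underline{x}(k) = \begin{bmatrix} x(k) \\ r(k) \end{bmatrix} = \begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix} \tag{2.99} $$

and write the system equations as

$$ \underline{x}(k+1) = \underline{f}(\underline{x}(k), u(k), k) + \underline{r}(k) \tag{2.100} $$

where

$$ \underline{f}(\underline{x}(k), u(k), k) = \begin{bmatrix} x_1(k)\,u(k) + x_2(k) \\ a\,x_2(k) \end{bmatrix} \tag{2.101} $$

and

$$ \underline{r}(k) = \begin{bmatrix} 0 \\ w(k) \end{bmatrix} \tag{2.102} $$

which is in the form of equation (2.94).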
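The equivalence of the two descriptions can be illustrated with a few lines of simulation. The sketch below assumes the plant is read as x(k+1) = x(k)u(k) + r(k) with r(k+1) = a r(k) + w(k); the function names, the value a = 0.8, and the random inputs are assumptions for the example. The original pair of scalar equations and the augmented vector equation are driven by the same w(k) and u(k) and compared step by step.

```python
import random

def step_original(x, r, u, w, a):
    """One step of the scalar plant and of the colored disturbance."""
    return x * u + r, a * r + w

def f_aug(x_aug, u, a):
    """Deterministic part of the augmented dynamics, x_aug = (x1, x2)."""
    x1, x2 = x_aug
    return (x1 * u + x2, a * x2)

def step_augmented(x_aug, u, w, a):
    """Augmented state update: f_aug plus the white disturbance (0, w)."""
    fx = f_aug(x_aug, u, a)
    noise = (0.0, w)
    return (fx[0] + noise[0], fx[1] + noise[1])

def compare(steps=10, a=0.8, seed=1):
    rng = random.Random(seed)
    x, r = 1.0, 0.0
    x_aug = (x, r)
    worst = 0.0
    for _ in range(steps):
        u = rng.uniform(-1.0, 1.0)   # arbitrary control sequence
        w = rng.gauss(0.0, 1.0)      # independent disturbance
        x, r = step_original(x, r, u, w, a)
        x_aug = step_augmented(x_aug, u, w, a)
        worst = max(worst, abs(x - x_aug[0]), abs(r - x_aug[1]))
    return worst

largest_difference = compare()
```

The disturbance entering the augmented system is white, so the augmented equations satisfy the independence assumption made for r(k) in (2.94), at the price of one extra state variable.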


Because the variables involved in equations (2.94) and (2.95) are
stochastic, a reasonable performance index will involve an expectation.
Thus we assume the performance index is

$$ J(k) = \mathop{\mathrm{Exp}}_{r(k),\ldots,r(N-1)} \Big\{ \sum_{j=k}^{N} \tfrac{1}{2}\|z(j) - y(j)\|^2_{Q(j)} + \sum_{j=k}^{N-1} \tfrac{1}{2}\|u(j)\|^2_{R(j)} \Big\} \tag{2.103} $$

In order to proceed by dynamic programming, we define the value function

$$ V_{N-k}(x(k)) = \min_{u(k),\ldots,u(N-1)} [J(k)] \tag{2.104} $$

Bellman³ shows that when the r(k) sequence is independent, the
principle of optimality implies

$$ V_{N-k}(x(k)) = \min_{u(k)} \mathop{\mathrm{Exp}}_{r(k)} \Big\{ \tfrac{1}{2}\|z(k) - y(k)\|^2_{Q(k)} + \tfrac{1}{2}\|u(k)\|^2_{R(k)} + V_{N-k-1}(x(k+1)) \Big\} \tag{2.105} $$

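As before, if we assume

$$ x(k+1) \approx f + f_x\,[x(k) - x^*(k)] + f_u\,[u(k) - u^*(k)] + r(k) \tag{2.106} $$

$$ y(k) \approx h + h_x\,[x(k) - x^*(k)] \tag{2.107} $$

and

$$ V_{N-k}(x(k)) = \tfrac{1}{2}\|x(k)\|^2_{P(k)} + \chi'(k)x(k) + \alpha(k) \tag{2.108} $$

then we obtain

$$ \tfrac{1}{2}\|x(k)\|^2_{P(k)} + \chi'(k)x(k) + \alpha(k) = \min_{u(k)} \mathop{\mathrm{Exp}}_{r(k)} \Big\{ \tfrac{1}{2}\|z(k) - y(k)\|^2_{Q(k)} + \tfrac{1}{2}\|u(k)\|^2_{R(k)} + \tfrac{1}{2}\|f_x x(k) + f_u u(k) + b(k) + r(k)\|^2_{P(k+1)} + [f_x x(k) + f_u u(k) + b(k) + r(k)]'\chi(k+1) + \alpha(k+1) \Big\} \tag{2.109} $$

where

$$ b(k) = f - f_x x^*(k) - f_u u^*(k) \tag{2.110} $$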

Performing the expectation operation and then the minimization operation
yields, assuming Exp[r(k)] = 0,

$$ u(k) = -[R(k) + f_u'P(k+1)f_u]^{-1} f_u'\,[P(k+1)f_x x(k) + P(k+1)b(k) + \chi(k+1)] \tag{2.111} $$

and

$$ \tfrac{1}{2}\|x(k)\|^2_{P(k)} + \chi'(k)x(k) + \alpha(k) = \tfrac{1}{2}\|z(k) - h_x x(k) - c(k)\|^2_{Q(k)} + \tfrac{1}{2}\mathop{\mathrm{Exp}}_{r(k)}[r'(k)P(k+1)r(k)] + \alpha(k+1) - \tfrac{1}{2}\|P(k+1)f_x x(k) + P(k+1)b(k) + \chi(k+1)\|^2_{f_u[R(k) + f_u'P(k+1)f_u]^{-1}f_u'} + \tfrac{1}{2}\|f_x x(k) + b(k)\|^2_{P(k+1)} + [f_x x(k) + b(k)]'\chi(k+1) \tag{2.112} $$

where

$$ c(k) = h - h_x x^*(k) \tag{2.113} $$

This equation will be satisfied for all x(k) if and only if the following
equations are satisfied:

$$ P(k) = h_x'Q(k)h_x + f_x'M(k)P(k+1)f_x \tag{2.114} $$

$$ \chi(k) = f_x'M(k)\,[P(k+1)b(k) + \chi(k+1)] - h_x'Q(k)\,[z(k) - c(k)] \tag{2.115} $$

$$ \alpha(k) = \alpha(k+1) + \tfrac{1}{2}\|z(k) - c(k)\|^2_{Q(k)} + \tfrac{1}{2}\|b(k)\|^2_{P(k+1)} + b'(k)\chi(k+1) + \tfrac{1}{2}\mathop{\mathrm{Exp}}_{r(k)}[r'(k)P(k+1)r(k)] - \tfrac{1}{2}\|P(k+1)b(k) + \chi(k+1)\|^2_{f_u[R(k) + f_u'P(k+1)f_u]^{-1}f_u'} \tag{2.116} $$

where

$$ M(k) = I - P(k+1)f_u\,[R(k) + f_u'P(k+1)f_u]^{-1} f_u' \tag{2.117} $$

The boundary values are

$$ P(N+1) = 0 \tag{2.118} $$

$$ \chi(N+1) = 0 \tag{2.119} $$

$$ \alpha(N+1) = 0 \tag{2.120} $$

These equations are identical to equations (2.31) through (2.37) except for
the additional expectation term in equation (2.116).
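For a zero-mean disturbance with covariance matrix Σ, the expectation term can be evaluated by the standard identity Exp[r'Pr] = trace(PΣ). The sketch below checks this on a small discrete distribution; the four equally likely outcomes and the matrix P are assumptions chosen only so that the expectation can be computed exactly by enumeration.

```python
def quad_form(P, r):
    """Return r' P r for a list-of-lists matrix P and a vector r."""
    n = len(r)
    return sum(r[i] * P[i][j] * r[j] for i in range(n) for j in range(n))

# Hypothetical zero-mean disturbance: four equally likely outcomes.
outcomes = [(1.0, 0.5), (-1.0, -0.5), (0.5, -1.0), (-0.5, 1.0)]
prob = 1.0 / len(outcomes)

P = [[2.0, 0.3],
     [0.3, 1.0]]          # stands in for P(k+1)

mean = [sum(prob * r[i] for r in outcomes) for i in range(2)]
cov = [[sum(prob * r[i] * r[j] for r in outcomes) for j in range(2)]
       for i in range(2)]

expected_quadratic = sum(prob * quad_form(P, r) for r in outcomes)
trace_P_cov = sum(P[i][j] * cov[j][i] for i in range(2) for j in range(2))
```

Because the term enters only the recursion (2.116) for α(k), and α(k) appears in neither the control law (2.111) nor the recursions (2.114) and (2.115), the disturbance statistics change the expected cost but not the control computed from the linearized model.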

In essence, these equations are the solution to the optimal control
problem for the linearized system. This solution differs from the exact
optimal nonlinear solution because the linearized system only approximates
the nonlinear system. In section 2.4 we were able to improve this
approximation by an iterative technique so that eventually the exact
solution was obtained. What are the prospects for a similar procedure
in this case?

An  examination  of  the  iterative  procedure  of  section  2.4   reveals  that 
the  technique  was  dependent  on  being  able  to  predict  exactly  the  state  at 
time     k+ 1     which  results  from  the  application  of  a  known  control  signal  to 
the  system  in  a  known  state  at  time    k.      Unfortunately,    because  of  the 
random  disturbance,      r  (k),      this  is  impossible  for  the  system  considered 
in  this  section. 

We can, however, use the following iterative algorithm to obtain an
approximate solution:

Step 1.  Solve equations (2.114) and (2.115) backward in time using

$$ x^*(j) = x_i(j) \tag{2.121} $$

$$ u^*(j) = u_i(j) \tag{2.122} $$

to compute P(N), ..., P(k) and χ(N), ..., χ(k).

Step 2.  Solve equations (2.111) and (2.106) forward in time with

$$ r(j) = 0; \qquad j \ge k \tag{2.123} $$

and again using

$$ x^*(j) = x_i(j) \tag{2.124} $$

$$ u^*(j) = u_i(j) \tag{2.125} $$

to compute u_{i+1}(k), ..., u_{i+1}(N-1) and x_{i+1}(k), ..., x_{i+1}(N).

The next iteration would then be performed using the extrapolated control
vectors, u_{i+1}(j), and the extrapolated state vectors, x_{i+1}(j), just
computed in place of u*(j) and x*(j). The procedure would be repeated
until satisfactory convergence had been achieved.

This algorithm should provide nearly optimal performance when the
P(j) and χ(j) obtained in this fashion are used to generate the control
for the real system. As time goes on, and the true state deviates more
and more from the extrapolated state, the performance will slowly be
degraded.

One way to overcome partially this degradation of performance is to
update the solution periodically by measuring the current state of the
system, and using this state as the starting point for a recomputation
of P(j) and χ(j), using the same iterative procedure as before. Of
course, this would require that the iterative algorithm be executed
much faster than the real system evolves.

Some  computer  results  using  this  approach  are  presented  in 
Chapter  V. 
CHAPTER III

CONTINUOUS TIME SYSTEMS

3.1      Introduction

Even  a  cursory  examination  of  the  results  of  Chapter  II  shows  that 
the  control  systems  required  by  the  theory  are  of  such  complexity  that  a 
high  speed  digital  computer  will  generally  be  required  to  investigate  or 
to  synthesize  the  control  system.    However,    for  the  few  analog  control 
system  applications  that  may  be  possible,    and  for  a  few  special  nonlinear 
control  problems  that  can  be  solved  analytically,    a  continuous  time  theory 
is  required. 

The  purpose  of  this  chapter  is  to  develop  the  theory  for  the  control 
of  continuous  time  nonlinear  systems.    This  theory  is  developed  in  a 
manner  analogous  to  that  used  in  Chapter  II  for  the  discrete  time  systems. 
It should be mentioned here that Kalman⁷ and Merriam¹⁵,¹⁶ have developed
the theory for linear continuous time systems.


3.2      Linear Systems


Consider the linear control system described by the equations

$$ \dot{x}(t) = F(t)x(t) + G(t)u(t); \qquad x(0) = c \tag{3.1} $$

$$ y(t) = H(t)x(t) \tag{3.2} $$

where x(t) is the n-dimensional system state vector, u(t) is the
r-dimensional control or input vector, and y(t) is the m-dimensional
system output vector. As indicated by the notation, the transformation
matrices F(t), G(t), and H(t) as well as the vectors x(t), u(t), and
y(t) can vary continuously with time. For this system we wish to find
the control u(τ) on the interval t ≤ τ ≤ T such that the performance
index

$$ J(t) = \int_t^T \Big[ \tfrac{1}{2}\|z(\tau) - y(\tau)\|^2_{Q(\tau)} + \tfrac{1}{2}\|u(\tau)\|^2_{R(\tau)} \Big]\, d\tau \tag{3.3} $$

is a minimum. Here, z(τ) is the desired output of the system.




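We define the value function

$$ V(x(t), t) = \min_{u(\tau),\; t \le \tau \le T} [J(t)] \tag{3.4} $$

By the principle of optimality, we have

$$ V(x(t), t) = \min_{u(\tau),\; t \le \tau \le t+\Delta t} \Big\{ \int_t^{t+\Delta t} \Big[ \tfrac{1}{2}\|z(\tau) - y(\tau)\|^2_{Q(\tau)} + \tfrac{1}{2}\|u(\tau)\|^2_{R(\tau)} \Big]\, d\tau + V(x(t+\Delta t), t+\Delta t) \Big\} \tag{3.5} $$

If we expand V(x(t+Δt), t+Δt) in a Taylor series about the point
[x(t), t], we get

$$ V(x(t), t) = \min_{u(\tau),\; t \le \tau \le t+\Delta t} \Big\{ \int_t^{t+\Delta t} \Big[ \tfrac{1}{2}\|z(\tau) - y(\tau)\|^2_{Q(\tau)} + \tfrac{1}{2}\|u(\tau)\|^2_{R(\tau)} \Big]\, d\tau + V(x(t), t) + V_t(x(t), t)\,\Delta t + V_x'(x(t), t)\,[x(t+\Delta t) - x(t)] + o(\Delta t) \Big\} \tag{3.6} $$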


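When we take the limit as Δt approaches zero (provided it exists, etc.),
equation (3.6) becomes

$$ V_t + \min_{u(t)} \Big\{ \tfrac{1}{2}\|z(t) - y(t)\|^2_{Q(t)} + \tfrac{1}{2}\|u(t)\|^2_{R(t)} + V_x'\dot{x}(t) \Big\} = 0 \tag{3.7} $$

or

$$ V_t + \min_{u(t)} \Big\{ \tfrac{1}{2}\|z(t) - y(t)\|^2_{Q(t)} + \tfrac{1}{2}\|u(t)\|^2_{R(t)} + V_x'F(t)x(t) + V_x'G(t)u(t) \Big\} = 0 \tag{3.8} $$

The minimization can be performed by ordinary methods of calculus,
yielding

$$ u(t) = -R^{-1}(t)\,G'(t)\,V_x \tag{3.9} $$

Substituting this value of u(t) into equation (3.8) yields the
Hamilton-Jacobi equation,

$$ V_t + \tfrac{1}{2}\|z(t) - H(t)x(t)\|^2_{Q(t)} - \tfrac{1}{2}\|V_x\|^2_{G(t)R^{-1}(t)G'(t)} + V_x'F(t)x(t) = 0 \tag{3.10} $$

The solution for this equation can be obtained by assuming

$$ V(x(t), t) = \tfrac{1}{2}\|x(t)\|^2_{P(t)} + \chi'(t)x(t) + \alpha(t) \tag{3.11} $$

Hence

$$ V_t = \tfrac{1}{2}\|x(t)\|^2_{\dot{P}(t)} + \dot{\chi}'(t)x(t) + \dot{\alpha}(t) \tag{3.12} $$

$$ V_x = P(t)x(t) + \chi(t) \tag{3.13} $$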

After substituting these expressions into equation (3.10), we obtain

$$ \tfrac{1}{2}\|x(t)\|^2_{\dot{P}(t)} + \dot{\chi}'(t)x(t) + \dot{\alpha}(t) + \tfrac{1}{2}\|z(t) - H(t)x(t)\|^2_{Q(t)} - \tfrac{1}{2}\|P(t)x(t) + \chi(t)\|^2_{G(t)R^{-1}(t)G'(t)} + [P(t)x(t) + \chi(t)]'F(t)x(t) = 0 \tag{3.14} $$

This equation can be satisfied for all x(t) if and only if the following
equations are satisfied:

$$ \dot{P}(t) = P(t)G(t)R^{-1}(t)G'(t)P(t) - P(t)F(t) - F'(t)P(t) - H'(t)Q(t)H(t) \tag{3.15} $$

$$ \dot{\chi}(t) = [P(t)G(t)R^{-1}(t)G'(t) - F'(t)]\,\chi(t) + H'(t)Q(t)z(t) \tag{3.16} $$

$$ \dot{\alpha}(t) = \tfrac{1}{2}\|\chi(t)\|^2_{G(t)R^{-1}(t)G'(t)} - \tfrac{1}{2}\|z(t)\|^2_{Q(t)} \tag{3.17} $$

The boundary conditions for these equations can be obtained from
equations (3.3) and (3.11). They are

$$ P(T) = 0 \tag{3.18} $$

$$ \chi(T) = 0 \tag{3.19} $$

$$ \alpha(T) = 0 \tag{3.20} $$

Here again, these equations must be solved backwards in time, but they
do not depend on the state of the system. Therefore, they can be
pre-computed, as in the discrete time case, if the desired output, z(τ), is
known on the interval t ≤ τ ≤ T. The control can be realized in the form
of the block diagram shown in figure 3.1.


Figure 3.1 - Continuous Time Optimal Linear Control System
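The backward solution of the Riccati equation (3.15) can be illustrated in the scalar, time-invariant case. The sketch below is an assumption-laden example rather than a general solver: it takes F = 0 and G = H = Q = R = 1, for which (3.15) reduces to dP/dt = P^2 - 1 with P(T) = 0 and has the closed-form solution P(t) = tanh(T - t). The integration uses a fourth-order Runge-Kutta step in the reversed time variable s = T - t.

```python
import math

def riccati_rhs(P, F, G, H, Q, R):
    """Scalar form of equation (3.15): dP/dt."""
    return P * G * G / R * P - 2.0 * P * F - H * H * Q

def integrate_backward(T, steps, F=0.0, G=1.0, H=1.0, Q=1.0, R=1.0):
    """Integrate (3.15) from P(T) = 0 back to t = 0 and return P(0)."""
    ds = T / steps
    P = 0.0                                  # boundary condition (3.18)

    def g(p):
        # With s = T - t, dP/ds = -dP/dt.
        return -riccati_rhs(p, F, G, H, Q, R)

    for _ in range(steps):
        k1 = g(P)
        k2 = g(P + 0.5 * ds * k1)
        k3 = g(P + 0.5 * ds * k2)
        k4 = g(P + ds * k3)
        P += ds * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return P

P0_numeric = integrate_backward(T=2.0, steps=200)
P0_exact = math.tanh(2.0)
```

The numerical and closed-form values of P(0) should agree closely. The example also shows P(t) rising from zero at t = T toward a constant value as T - t grows, so that its rate of change becomes small far from the terminal time.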
3.3      Nonlinear Systems


The nonlinear systems we consider here can be described by the
equations

$$ \dot{x}(t) = f(x(t), u(t), t); \qquad x(0) = c \tag{3.21} $$

$$ y(t) = h(x(t), t) \tag{3.22} $$

where x(t), u(t), and y(t) are state, control, and output vectors, as
before, and where f(x(t), u(t), t) and h(x(t), t) are continuous time
vector valued functions. It is necessary to assume that f and h
satisfy certain differentiability conditions in what follows. Whenever
derivatives of these functions appear, we will tacitly assume that they
exist.

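For the system just described, we wish to find the control, u(τ),
on the interval, t ≤ τ ≤ T, such that the performance index

$$ J(t) = \int_t^T \Big[ \tfrac{1}{2}\|z(\tau) - y(\tau)\|^2_{Q(\tau)} + \tfrac{1}{2}\|u(\tau)\|^2_{R(\tau)} \Big]\, d\tau \tag{3.23} $$

is a minimum.

We define the value function

$$ V(x(t), t) = \min_{u(\tau),\; t \le \tau \le T} [J(t)] \tag{3.24} $$

Then by the principle of optimality,

$$ V(x(t), t) = \min_{u(\tau),\; t \le \tau \le t+\Delta t} \Big\{ \int_t^{t+\Delta t} \Big[ \tfrac{1}{2}\|z(\tau) - y(\tau)\|^2_{Q(\tau)} + \tfrac{1}{2}\|u(\tau)\|^2_{R(\tau)} \Big]\, d\tau + V(x(t+\Delta t), t+\Delta t) \Big\} \tag{3.25} $$

By expanding V(x(t+Δt), t+Δt) in a Taylor series about x(t) and t,
and then taking the limit as Δt approaches 0, we get

$$ V_t + \min_{u(t)} \Big\{ \tfrac{1}{2}\|z(t) - y(t)\|^2_{Q(t)} + \tfrac{1}{2}\|u(t)\|^2_{R(t)} + V_x'\dot{x}(t) \Big\} = 0 \tag{3.26} $$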


Since the system is nonlinear, we cannot solve this Hamilton-Jacobi
equation directly in general. So, as in Chapter II, we resort to
linearization. We use the approximations

$$ \dot{x}(t) \approx f(x^*(t), u^*(t), t) + f_x(x^*(t), u^*(t), t)\,[x(t) - x^*(t)] + f_u(x^*(t), u^*(t), t)\,[u(t) - u^*(t)] \tag{3.27} $$

and

$$ y(t) \approx h(x^*(t), t) + h_x(x^*(t), t)\,[x(t) - x^*(t)] \tag{3.28} $$

With these approximations, equation (3.26) becomes

$$ V_t + \min_{u(t)} \Big\{ \tfrac{1}{2}\|z(t) - y(t)\|^2_{Q(t)} + \tfrac{1}{2}\|u(t)\|^2_{R(t)} + V_x'f + V_x'f_x\,[x(t) - x^*(t)] + V_x'f_u\,[u(t) - u^*(t)] \Big\} = 0 \tag{3.29} $$

The minimization operation yields

$$ u(t) = -R^{-1}(t)\, f_u'\, V_x \tag{3.30} $$

and

$$ V_t + \tfrac{1}{2}\|z(t) - h_x x(t) - c(t)\|^2_{Q(t)} - \tfrac{1}{2}\|V_x\|^2_{f_u R^{-1}(t) f_u'} + V_x'\,[f_x x(t) + b(t)] = 0 \tag{3.31} $$

where

$$ b(t) = f - f_x x^*(t) - f_u u^*(t) \tag{3.32} $$

and

$$ c(t) = h - h_x x^*(t) \tag{3.33} $$

A solution for equation (3.31) can be obtained by assuming

$$ V(x(t), t) = \tfrac{1}{2}\|x(t)\|^2_{P(t)} + \chi'(t)x(t) + \alpha(t) \tag{3.34} $$

which implies

$$ V_t(x(t), t) = \tfrac{1}{2}\|x(t)\|^2_{\dot{P}(t)} + \dot{\chi}'(t)x(t) + \dot{\alpha}(t) \tag{3.35} $$

and

$$ V_x(x(t), t) = P(t)x(t) + \chi(t) \tag{3.36} $$

Combining equations (3.31), (3.35), and (3.36) yields

$$ \tfrac{1}{2}\|x(t)\|^2_{\dot{P}(t)} + \dot{\chi}'(t)x(t) + \dot{\alpha}(t) + \tfrac{1}{2}\|z(t) - h_x x(t) - c(t)\|^2_{Q(t)} - \tfrac{1}{2}\|P(t)x(t) + \chi(t)\|^2_{f_u R^{-1}(t) f_u'} + [P(t)x(t) + \chi(t)]'\,[f_x x(t) + b(t)] = 0 \tag{3.37} $$

[Figure 3.2: block diagram of the continuous time nonlinear control system]


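This equation will be satisfied for all x(t) if and only if the following
equations are satisfied:

$$ \dot{P}(t) = P(t) f_u R^{-1}(t) f_u' P(t) - P(t) f_x - f_x' P(t) - h_x' Q(t) h_x \tag{3.38} $$

$$ \dot{\chi}(t) = [P(t) f_u R^{-1}(t) f_u' - f_x']\,\chi(t) - P(t)b(t) + h_x' Q(t)\,[z(t) - c(t)] \tag{3.39} $$

$$ \dot{\alpha}(t) = \tfrac{1}{2}\|\chi(t)\|^2_{f_u R^{-1}(t) f_u'} - \tfrac{1}{2}\|z(t) - c(t)\|^2_{Q(t)} - b'(t)\chi(t) \tag{3.40} $$

The boundary conditions are the same as for the linear case.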

If we are given x*(t) and u*(t), we can compute P(t) and χ(t) in
advance. Then these parameters can be used to determine a near optimum
control for the system. Of course, how near optimal the control system
is depends on how good the approximations (3.27) and (3.28) are.
Computationally, we can proceed in a manner analogous to the discrete
time iterative procedure. To do this, we can use the x(t) and u(t)
determined by the ith iteration as x*(t) and u*(t) for the (i+1)st iteration.
Like the iterative algorithm of section 2.4, this algorithm can be
shown to yield the exact solution to the continuous time nonlinear control
problem whenever it converges.

The  control  system  can  be  synthesized  in  the  form  of  the  block 
diagram  of   figure  3.  2.    As  can  be  seen  from  figure  3.  2,    the  continuous 
time  control  system  is  almost  identical  in  form  to  the  discrete  time 
nonlinear  control  system. 

Some  additional  insight  into  the  problem  of  optimal  control  can  be 
gained  by  examining  the  nature  of  the  equations  for    P  (t)    and    x(t).      As 
the  quantity,      T-t,      approaches    zero,      P  (t)     and    x  (t)     approach  zero. 
Hence,    the  optimum  control  signal  approaches  zero  as  the  terminal  time 
nears.    On  the  other  hand,    when    T-t    is  very  large,    and  the  system 
being  controlled  is  linear  time  invariant,     P  (t)    is  very  small.    We  would 
expect  that  when     T-t    is  very  large,    and  when  the  time  variations  and 
nonlinearities  of  the  system  being  controlled  are  not  severe,      P  (t) 
should  be  small,    also.    The  director  part  of  the  input,     x  (t),      is  derived 
from  the  desired  output,      z  (t),      by  the  feedback  system  shown  in 
figure 3.3.

Figure 3.3 - Block Diagram of System for x(t)

If  the  output  of  this  system  follows  the  input  reasonably  well,    the 
system  synthesized  using    h^Q  (t)  z  (t)    in  place  of    x  (t)     might  perform 
near  optimally,    provided     b  (t)     and    c  (t)     are  reasonably  small  in 
magnitude. 

The  comments  above  have  been  imprecise,    and  were  meant  only 
to  convey  some  insight  into  the  problem  beyond  the  bare  mathematical 
statements. 
CHAPTER IV

CONSERVATIVE  SYSTEMS 

4.  1      Introduction 

A  special  class  of  nonlinear  systems  which  we  shall  call  "conservative," 
can  be  treated  analytically  and  exactly  by  the  methods  introduced  in 
Chapters  II   and    III.     The  purpose  of  this  chapter  is  to  study  this  class 
of  nonlinear  problems  by  means  of  two  examples.    Often,    as  much  can 
be  learned  from  the  study  of  one  analytic  example  as  from  a  hundred 
numerical  examples. 

4.  2      General 

Consider  the  nonlinear  system 

$$\dot{x} = f(x) + u; \qquad x(0) = c \tag{4.1}$$

If the performance criterion is

$$J = \int_0^T L(x, u)\, dt \tag{4.2}$$

then the loss equation, equation (3.26), is

$$\min_{u(t)} \left[ L(x, u) + V_t + V_x' f(x) + V_x' u \right] = 0 \tag{4.3}$$

If the term, V_x' f(x), in equation (4.3) vanishes identically for all x,
a great simplification can result. Of course V, and hence V_x, depend
strongly on the form of L(x, u). Thus V_x' f(x) will vanish only if
L(x, u) has a special form. Fortunately, this is sometimes the case in
practical problems. The example problems which follow will serve to
illustrate the nature of the special form L(x, u) must have to permit
this simplification. In addition, the example problems will permit
us to study the analytic solutions of some optimal nonlinear control
problems, and compare them with some sub-optimal solutions.
4.3      Spinning Body Problem

The equations of motion for the angular velocities of a freely spinning
body about three mutually perpendicular axes can be written as

$$\begin{aligned}
\dot{x}_1 &= a_1 x_2 x_3; & x_1(0) &= c_1 \\
\dot{x}_2 &= a_2 x_1 x_3; & x_2(0) &= c_2 \\
\dot{x}_3 &= a_3 x_1 x_2; & x_3(0) &= c_3
\end{aligned} \tag{4.4}$$

where x1, x2, and x3 are the angular velocities, and where

$$a_1 + a_2 + a_3 = 0 \tag{4.4A}$$

These equations of motion are coupled and nonlinear. If we wish to
control the spin of this system by exerting torques about each of the
three axes, the equations of motion become

$$\begin{aligned}
\dot{x}_1 &= a_1 x_2 x_3 + u_1; & x_1(0) &= c_1 \\
\dot{x}_2 &= a_2 x_1 x_3 + u_2; & x_2(0) &= c_2 \\
\dot{x}_3 &= a_3 x_1 x_2 + u_3; & x_3(0) &= c_3
\end{aligned} \tag{4.5}$$

where u1, u2, and u3 are the control variables proportional to the
torques.

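If we wish to reduce the angular velocities to a minimum, subject
to a constraint on the control effort expended, an appropriate performance
criterion might be*

$$J = \int_0^T \left\{ \tfrac{1}{2} q(t) \left[ x_1^2 + x_2^2 + x_3^2 \right] + \tfrac{1}{2} r(t) \left[ u_1^2 + u_2^2 + u_3^2 \right] \right\} dt \tag{4.6}$$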


Optimal  Control 


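The control which minimizes J can be found by the method of
Chapter III. The loss equation is

$$\min_{u_1, u_2, u_3} \Big[ V_t + \tfrac{1}{2} q(t) \left[ x_1^2 + x_2^2 + x_3^2 \right] + \tfrac{1}{2} r(t) \left[ u_1^2 + u_2^2 + u_3^2 \right] + V_{x_1} a_1 x_2 x_3 + V_{x_2} a_2 x_1 x_3 + V_{x_3} a_3 x_1 x_2 + V_{x_1} u_1 + V_{x_2} u_2 + V_{x_3} u_3 \Big] = 0 \tag{4.7}$$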


* The spinning body control problem has been treated by Athans
and Windeknecht [21], but their methods differ significantly from that
used here.
If we assume

$$V = \tfrac{1}{2} p(t) \left[ x_1^2 + x_2^2 + x_3^2 \right] \tag{4.8}$$

then the optimal control is

$$u_i = -k(t)\, x_i, \qquad i = 1, 2, 3 \tag{4.9}$$

where

$$k(t) = p(t)/r(t), \tag{4.10}$$

and equation (4.7) becomes

$$\tfrac{1}{2} \left[ \dot{p}(t) - p^2(t)/r(t) + q(t) \right] \left[ x_1^2 + x_2^2 + x_3^2 \right] = 0 \tag{4.11}$$

But since this equation must be true for all x1, x2, and x3, we must
have

$$\dot{p}(t) - p^2(t)/r(t) + q(t) = 0 \tag{4.12}$$

From the definition of V, the boundary condition is

$$p(T) = 0 \tag{4.13}$$

If q and r are constant, the solution for equation (4.12) is

$$p(\tau) = r\, k(\tau) = r a\, \frac{1 - e^{-2a\tau}}{1 + e^{-2a\tau}} \tag{4.14}$$

where

$$a = \sqrt{q/r} \tag{4.15}$$

and

$$\tau = T - t \tag{4.16}$$
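
The reduction of the loss equation to a single equation for p(t) can be spelled out step by step. The following is a sketch of the intermediate algebra, assuming the quadratic trial form of V given above; the shorthand W for the sum of squared angular velocities is borrowed from the sub-optimal analysis later in this section and is introduced here only for brevity.

```latex
\begin{aligned}
W &\equiv x_1^2 + x_2^2 + x_3^2, \qquad V = \tfrac{1}{2} p(t)\, W
  \;\Rightarrow\; V_t = \tfrac{1}{2} \dot{p}(t)\, W, \quad V_{x_i} = p(t)\, x_i, \\
V_{x_1} a_1 x_2 x_3 + V_{x_2} a_2 x_1 x_3 + V_{x_3} a_3 x_1 x_2
  &= p(t)\,(a_1 + a_2 + a_3)\, x_1 x_2 x_3 = 0 \quad \text{by (4.4A)}, \\
\min_{u_i} \left[ \tfrac{1}{2} r(t)\, u_i^2 + p(t)\, x_i u_i \right]
  &= -\tfrac{1}{2} \frac{p^2(t)}{r(t)}\, x_i^2
  \quad \text{at } u_i = -\frac{p(t)}{r(t)}\, x_i, \\
\tfrac{1}{2} \dot{p}\, W + \tfrac{1}{2} q\, W - \tfrac{1}{2} \frac{p^2}{r}\, W &= 0 .
\end{aligned}
```

The second line is where the "conservative" property enters: the drift term drops out of the loss equation only because the coefficients sum to zero.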


A plot of p(τ) is shown in figure 4.1.

Figure 4.1 - Plot of p(τ)/ra versus aτ


Notice that the optimal controller is linear with time varying gains
even though the system controlled is nonlinear. Also notice that the time
varying gains reach 76 percent of their steady-state value in τ = 1/a
seconds, 96 percent in τ = 2/a seconds, and 99.5 percent in
τ = 3/a seconds. As is evident, the quantity 1/a plays the role of
a time constant.
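
The gain schedule for constant q and r can be checked numerically. The sketch below is illustrative only: the function names are made up for this example, and it relies on the identity (1 - e^{-2aτ})/(1 + e^{-2aτ}) = tanh(aτ). The second function tests, by a central difference, that the closed form satisfies the equation for p written in terms of τ = T - t, that is dp/dτ = q - p²/r.

```python
import math

def riccati_gain(tau, q, r):
    """Gain k(tau) = p(tau)/r for constant q and r (closed form, tanh identity)."""
    a = math.sqrt(q / r)
    return a * math.tanh(a * tau)

def riccati_residual(tau, q, r, h=1e-5):
    """dp/dtau - (q - p^2/r); close to zero if p(tau) solves the equation."""
    def p(s):
        return r * riccati_gain(s, q, r)
    dp_dtau = (p(tau + h) - p(tau - h)) / (2.0 * h)
    return dp_dtau - (q - p(tau) ** 2 / r)

if __name__ == "__main__":
    q, r = 4.0, 1.0                      # arbitrary test values, so a = 2
    a = math.sqrt(q / r)
    for n in (1, 2, 3):
        print(n, riccati_gain(n / a, q, r) / a)   # fraction of steady-state gain
```

The printed fractions are tanh(1), tanh(2) and tanh(3), roughly 0.762, 0.964 and 0.995.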

The  controller  may  be  realized  in  the  form  of  the  block  diagram  of 
figure  4.  2. 


Figure 4.2 - Spinning Body Control System Block Diagram

Sub-optimal Control

It is instructive to compare the optimal control system of the last
section with the sub-optimal control system which simply uses constant
gains. In order to make this comparison, the performance criterion
must be computed for the optimal and sub-optimal controls on the time
interval [0, T].

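For the optimal control the performance criterion is

$$J^* = V(c, 0) \tag{4.17}$$

or, for this problem,

$$J^* = \tfrac{1}{2} r a\, \frac{1 - e^{-2aT}}{1 + e^{-2aT}} \left[ c_1^2 + c_2^2 + c_3^2 \right] \tag{4.18}$$

For the sub-optimal control with

$$u_1 = -k x_1, \qquad u_2 = -k x_2, \qquad u_3 = -k x_3 \tag{4.19}$$

the performance criterion is

$$J = \int_0^T \left\{ \tfrac{1}{2} q \left[ x_1^2 + x_2^2 + x_3^2 \right] + \tfrac{1}{2} r k^2 \left[ x_1^2 + x_2^2 + x_3^2 \right] \right\} dt \tag{4.20}$$

or

$$J = \tfrac{1}{2} \left[ q + r k^2 \right] \int_0^T W(t)\, dt \tag{4.21}$$

where

$$W(t) = x_1^2(t) + x_2^2(t) + x_3^2(t) \tag{4.22}$$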


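It is possible to compute W(t) from equation (4.5) in the following
manner:

$$\begin{aligned}
x_1 \dot{x}_1 &= a_1 x_1 x_2 x_3 - k x_1^2 \\
x_2 \dot{x}_2 &= a_2 x_1 x_2 x_3 - k x_2^2 \\
x_3 \dot{x}_3 &= a_3 x_1 x_2 x_3 - k x_3^2
\end{aligned} \tag{4.23}$$

Adding, and using equation (4.4A),

$$\tfrac{1}{2} \frac{d}{dt} \left[ x_1^2 + x_2^2 + x_3^2 \right] = -k \left[ x_1^2 + x_2^2 + x_3^2 \right] \tag{4.24}$$

or

$$\dot{W}(t) + 2k\, W(t) = 0; \qquad W(0) = c_1^2 + c_2^2 + c_3^2 \tag{4.25}$$

The solution of equation (4.25) is

$$W(t) = e^{-2kt}\, W(0) \tag{4.26}$$

From this the sub-optimal performance criterion may be computed. This
gives

$$J = \frac{1}{4k} \left[ q + r k^2 \right] \left[ 1 - e^{-2kT} \right] \left[ c_1^2 + c_2^2 + c_3^2 \right] \tag{4.27}$$

For k = a = √(q/r), J becomes

$$J = \tfrac{1}{2} r a \left[ 1 - e^{-2aT} \right] \left[ c_1^2 + c_2^2 + c_3^2 \right] \tag{4.28}$$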


The ratio, J/J*, is then simply

$$J/J^* = 1 + e^{-2aT} \tag{4.29}$$

A plot of J/J* is shown in figure 4.3.

Figure 4.3 - J/J* versus aT

The maximum value of J/J* is 2 when T, the control interval, is
infinitesimal, and the ratio tends to unity as T increases. In fact, when
T is just 1/a seconds, the ratio is only 1.13.

To get an indication of the sensitivity of the performance index, we can
compute the sub-optimal control with k = a/2 and compare the results with
those for k = a.

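The value of the performance index for k = a/2 is given by

$$J = \frac{1}{2a} \left[ q + \tfrac{1}{4} r a^2 \right] \left[ 1 - e^{-aT} \right] \left[ c_1^2 + c_2^2 + c_3^2 \right] \tag{4.30}$$

or, since a = √(q/r),

$$J = \tfrac{5}{8} r a \left[ 1 - e^{-aT} \right] \left[ c_1^2 + c_2^2 + c_3^2 \right] \tag{4.31}$$

In this case, the ratio, J/J*, is

$$J/J^* = \frac{5}{4}\, \frac{\left[ 1 - e^{-aT} \right] \left[ 1 + e^{-2aT} \right]}{\left[ 1 - e^{-2aT} \right]} \tag{4.32}$$

A plot of this ratio is also shown in figure 4.3.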


We can see from figure 4.3 that the constant gain sub-optimal control
provides a nearly optimal system. The gain setting with k = a would be
better if the control interval is much greater than 1/a, and k = a/2
would be better if the control interval is much less than 1/a. In any case
the system is relatively insensitive to variations in the gain setting, and
this is the reason that the optimal control system is little better than the
constant gain sub-optimal systems.
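
The comparison of gain settings can be tabulated directly from the closed-form expressions. This is an illustrative sketch; the function name and the sample values of aT are assumptions of the example, and q = r = 1 is chosen only so that a = 1.

```python
import math

def cost_ratio(k, q, r, T):
    """J/J* for the constant-gain law u_i = -k x_i.

    Uses J = (q + r k^2)(1 - e^{-2kT})/(4k) * W(0) and
    J* = 0.5 r a tanh(aT) * W(0); the factor W(0) cancels.
    """
    a = math.sqrt(q / r)
    j_sub = (q + r * k * k) * (1.0 - math.exp(-2.0 * k * T)) / (4.0 * k)
    j_opt = 0.5 * r * a * math.tanh(a * T)
    return j_sub / j_opt

if __name__ == "__main__":
    for aT in (0.1, 1.0, 10.0):
        print(aT, cost_ratio(1.0, 1.0, 1.0, aT), cost_ratio(0.5, 1.0, 1.0, aT))
```

For short intervals the half gain gives the smaller ratio, and for long intervals the full gain does, which is the crossover described above.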

Terminal  Control 

If  we  desire  to  reduce  the  angular  velocities  of  the  spinning  body  to 
a  minimum  at  the  terminal  time  only,    subject  to  a  constraint  on  the 
control  effort  expended,    an  appropriate  performance  criterion  might  be 

$$J = \tfrac{1}{2} q \left[ x_1^2(T) + x_2^2(T) + x_3^2(T) \right] + \int_0^T \tfrac{1}{2} r \left[ u_1^2 + u_2^2 + u_3^2 \right] dt \tag{4.33}$$

The  results  of  the  sub-section  on  optimal  control  apply  directly  to 
this  problem  if  we  let 

$$q(t) = q\, u_0(t - T) \tag{4.34}$$

where u_0(t) is the unit impulse function.
The equation for p(t) then is

$$\dot{p}(t) - p^2(t)/r = 0; \qquad p(T) = q \tag{4.35}$$

The solution of equation (4.35) is

$$p(\tau) = \frac{q}{1 + a\tau} \tag{4.36}$$

where again

$$\tau = T - t \tag{4.37}$$

and

$$a = q/r \tag{4.38}$$

Thus the optimal value of the performance index is

$$J^* = \frac{1}{2}\, \frac{q\, W(0)}{1 + aT} \tag{4.39}$$
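
The solution for p in the terminal control case follows from a one-line change of variable. The derivation below is a sketch supplied for the reader, using only the equation for p(t), its boundary condition p(T) = q, and τ = T - t.

```latex
\begin{aligned}
\frac{dp}{d\tau} &= -\dot{p} = -\frac{p^2}{r}, \qquad p\big|_{\tau = 0} = q, \\
\frac{d}{d\tau}\!\left( \frac{1}{p} \right) &= \frac{1}{r}
  \;\Rightarrow\; \frac{1}{p(\tau)} = \frac{1}{q} + \frac{\tau}{r}, \\
p(\tau) &= \frac{q}{1 + (q/r)\,\tau} = \frac{q}{1 + a\tau}, \qquad a = q/r .
\end{aligned}
```

Note that in this sub-section a denotes q/r, not the square root used for the integral criterion.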
A plot of p(τ) is shown in figure 4.4.

Figure 4.4 - Plot of p(τ)/q versus aτ


Again, it is interesting to compare the optimal controller with
a sub-optimal constant gain linear controller. In terms of the
constant gain, k, the performance criterion for the sub-optimal
controller is

$$J = \tfrac{1}{2} q\, W(0) \left[ e^{-2kT} - \tfrac{1}{2} (rk/q)\, e^{-2kT} + \tfrac{1}{2}\, rk/q \right] \tag{4.40}$$


If the gain, k, is set equal to a, the sub-optimal performance
criterion, J, approaches the optimal performance criterion, J*,
for very short control intervals. In this case, the ratio, J/J*,
is

$$J/J^* = \tfrac{1}{2} (1 + aT) \left( e^{-2aT} + 1 \right) \tag{4.41}$$

A plot of J/J* is shown in figure 4.5.

Figure 4.5 - Performance Ratio, J/J*, for Terminal Control

It should be noted that the value for k in this example was chosen
to give near optimal performance over relatively short control intervals.
Better performance could be achieved over longer control intervals with
a lower gain setting. For instance, if k = a/2, the value of the performance
criterion is

$$J = \tfrac{1}{2} q\, W(0) \left[ \tfrac{1}{4} + \tfrac{3}{4} e^{-aT} \right] \tag{4.42}$$

The ratio, J/J*, then is

$$J/J^* = \tfrac{1}{4} (1 + aT) \left( 1 + 3 e^{-aT} \right) \tag{4.43}$$


A plot of this is shown in figure 4.5 also. A plot is also shown for
k  =  \   a  . 

As can be seen from the plot, there exists a constant gain for any
particular value of control interval which will give very nearly optimal
performance. For instance, with a control interval of 1/a, k = a/2
will give a performance index of about 1.05 times the true optimal
performance index.
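
The terminal control ratios for different constant gains can be evaluated from one expression. The sketch below is illustrative: it divides the sub-optimal criterion by the optimal one, writing the gain as k = beta * a with a = q/r, so that rk/q = beta and kT = beta * aT. The names `beta` and `terminal_cost_ratio` are introduced here and do not appear in the source.

```python
import math

def terminal_cost_ratio(beta, aT):
    """J/J* for terminal control with constant gain k = beta * a, a = q/r.

    J  = 0.5 q W(0) [e^{-2kT} - 0.5 (rk/q) e^{-2kT} + 0.5 rk/q]
    J* = 0.5 q W(0) / (1 + aT)
    """
    decay = math.exp(-2.0 * beta * aT)
    return (1.0 + aT) * (decay - 0.5 * beta * decay + 0.5 * beta)

if __name__ == "__main__":
    for beta in (1.0, 0.5):
        print(beta, [round(terminal_cost_ratio(beta, aT), 3) for aT in (0.5, 1.0, 2.0)])
```

Setting beta = 1 and beta = 0.5 reproduces the two ratio formulas of this sub-section, and beta = 0.5 at aT = 1 gives a ratio of about 1.05.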
4.4      Nonlinear Spring Problem

The equation of motion for a mass attached to a cubic spring can be
written as

$$\ddot{x} + x^3 = 0 \tag{4.44}$$

or if control is exerted, i.e., the system is forced, the equation is

$$\ddot{x} + x^3 = u \tag{4.45}$$

This equation can be written as the system of first order equations

$$\begin{aligned}
\dot{x}_1 &= x_2 + u_1 \\
\dot{x}_2 &= -x_1^3 + u_2
\end{aligned} \tag{4.46}$$

where

$$x_1 = x \tag{4.47}$$

and

$$u = \dot{u}_1 + u_2 \tag{4.48}$$


The state variable, x2, is not as easily identified with the original
system variables, but this is of little consequence.

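Suppose that we wish to control the system (4.46) such that

$$J = \int_0^T \left[ q(t) \left( \tfrac{1}{4} x_1^4 + \tfrac{1}{2} x_2^2 \right) + r(t) \left( x_1^2 u_1^2 + \tfrac{1}{2} u_2^2 \right) \right] dt \tag{4.49}$$

is a minimum. The loss equation for this system is

$$\min_{u_1, u_2} \left[ V_t + q(t) \left( \tfrac{1}{4} x_1^4 + \tfrac{1}{2} x_2^2 \right) + r(t) \left( x_1^2 u_1^2 + \tfrac{1}{2} u_2^2 \right) + V_{x_1} (x_2 + u_1) + V_{x_2} (-x_1^3 + u_2) \right] = 0 \tag{4.50}$$

If we assume

$$V = p(t) \left( \tfrac{1}{4} x_1^4 + \tfrac{1}{2} x_2^2 \right) \tag{4.51}$$

then the optimal control is

$$\begin{aligned}
u_1 &= -\tfrac{1}{2}\, p(t)\, x_1 / r(t) \\
u_2 &= -p(t)\, x_2 / r(t)
\end{aligned} \tag{4.52}$$

and equation (4.50) becomes

$$\left[ \dot{p}(t) - p^2(t)/r(t) + q(t) \right] \left[ \tfrac{1}{4} x_1^4 + \tfrac{1}{2} x_2^2 \right] = 0 \tag{4.53}$$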
Since this equation must be satisfied for all values of x1 and x2, we
must have

$$\dot{p}(t) - p^2(t)/r(t) + q(t) = 0 \tag{4.54}$$

The boundary condition is

$$p(T) = 0 \tag{4.55}$$

This  equation  is  identical  to  equation  (4.  12),    and  the  results  of 
section  4.  3  of  this  chapter,    including  the  sub-optimal  control  results, 
are  equally  applicable  to  this  problem. 
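
The way the spring problem collapses to the same equation for p(t) can be sketched as follows. This derivation is a reconstruction, not a quotation. It assumes the weighting q(t)(x1^4/4 + x2^2/2) + r(t)(x1^2 u1^2 + u2^2/2) in the performance criterion and the trial form V = p(t)(x1^4/4 + x2^2/2), which are the forms consistent with the optimal control u1 = -p x1/(2r), u2 = -p x2/r. The symbol E is shorthand introduced here.

```latex
\begin{aligned}
E &\equiv \tfrac{1}{4} x_1^4 + \tfrac{1}{2} x_2^2, \qquad V = p(t)\, E
  \;\Rightarrow\; V_t = \dot{p}\, E, \quad V_{x_1} = p\, x_1^3, \quad V_{x_2} = p\, x_2, \\
V_{x_1} x_2 + V_{x_2} (-x_1^3) &= p\, x_1^3 x_2 - p\, x_2 x_1^3 = 0, \\
\min_{u_1} \left[ r\, x_1^2 u_1^2 + p\, x_1^3 u_1 \right] &= -\tfrac{1}{4} \frac{p^2}{r}\, x_1^4
  \quad \text{at } u_1 = -\tfrac{1}{2} \frac{p}{r}\, x_1, \\
\min_{u_2} \left[ \tfrac{1}{2} r\, u_2^2 + p\, x_2 u_2 \right] &= -\tfrac{1}{2} \frac{p^2}{r}\, x_2^2
  \quad \text{at } u_2 = -\frac{p}{r}\, x_2, \\
\dot{p}\, E + q\, E - \frac{p^2}{r}\, E &= 0 .
\end{aligned}
```

As with the spinning body, the drift term vanishes because V is a function of a quantity (here E) that the unforced system conserves.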

Since

$$u = \dot{u}_1 + u_2 \tag{4.56}$$

the control, u, may be expressed as

$$u = -\tfrac{1}{2}\, p(t)\, \dot{x}_1 / r(t) - p(t)\, x_2 / r(t) \tag{4.57}$$

For the actual synthesis of the controller, however, this expression
for u is unsatisfactory because the state variable, x2, has not been
identified with the original system variables. We can get around this
by expressing x2 in terms of x1 and u1, thus

$$x_2 = \dot{x}_1 - u_1 \tag{4.58}$$

or

$$x_2 = \dot{x}_1 + \tfrac{1}{2}\, p(t)\, x_1 / r(t) \tag{4.59}$$

The control, u, then is

$$u = -\frac{3 p(t)}{2 r(t)}\, \dot{x}_1 - \frac{p^2(t)}{2 r^2(t)}\, x_1 \tag{4.60}$$

The block diagram for this control system is shown in figure 4.6.

Figure 4.6 - Nonlinear Spring Control System Block Diagram


Admittedly, the nonlinear systems and the performance criteria
used in this example problem and the previous one are very special.
However, because we are able to obtain analytical solutions, a great
deal of insight can be gained from them about the nature and behavior
of optimal controllers in nonlinear systems. In particular, we have
found that simple constant gain linear controllers can provide very
nearly optimal performance over a wide range of conditions.
CHAPTER V
COMPUTER  RESULTS 
5.  1      Introduction 

The  results  of  several  computer  problems  illustrating  the  methods 
of  Chapter  II  are  presented  in  this  chapter.    Several  variations  of  each 
problem  are  presented  in  order  to  show  the  effect  of  changes  in  the 
initial  state  and  changes  in  the  performance  index.    It  should  be  borne 
in  mind  that  since  the  system  being  controlled  is  nonlinear,    the 
controller  parameters  depend  on  the  initial  state  of  the  system. 

In  addition,  the  results  of  controlling  some  of  the  nonlinear  systems 
with  simple  sub-optimal  linear  controllers  are  presented  and  compared 
with  the  optimal   results. 

The  results  of  this  section  were  obtained  on  the  IBM  7090  computer 
at  the  MIT  computation  center.    The  Fortran  programs  used  to  obtain 
the  solutions  for  the  two  state-variable  deterministic  problems  are 
given  in  Appendix  D.    In  all  cases,    the  change  in  the  performance  index 
from one iteration to the next was used as a convergence criterion.
When  the  magnitude  of  this  change  was  less  than  one  per  cent  of  the 
value  of  the  performance  index,    the  iteration  procedure  was  terminated. 
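
The stopping rule can be written as a small loop. This is a hedged sketch rather than the Fortran of Appendix D: `improve` is a hypothetical callable standing in for one pass of the iterative procedure, and the choice to measure the change relative to the newest value of the performance index (rather than the previous one) is an assumption, since the text does not say which is used.

```python
def run_iterations(improve, max_iter=50, rel_tol=0.01):
    """Repeat improve() until the change in J is below rel_tol * |J|.

    improve  -- hypothetical callable; performs one iteration and returns J
    Returns (number_of_iterations, final_J).
    """
    j_prev = improve()
    for i in range(2, max_iter + 1):
        j_curr = improve()
        if abs(j_curr - j_prev) < rel_tol * abs(j_curr):
            return i, j_curr
        j_prev = j_curr
    raise RuntimeError("no convergence within max_iter iterations")

if __name__ == "__main__":
    values = iter([2.0, 1.3, 1.21, 1.205])        # made-up sequence of J values
    print(run_iterations(lambda: next(values)))
```

With the made-up sequence shown, the loop stops on the fourth value, the first whose change from its predecessor is under one per cent.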

5.  2      One  State-Variable  Example 

The system considered for this example can be described by the
equations

$$x(k+1) = x(k) - 0.05\, x^3(k) + 0.05\, u(k); \qquad x(1) = c \tag{5.1}$$

$$y(k) = x(k) \tag{5.2}$$

The system may be thought of as the discrete time approximation of the
continuous time system

$$\dot{x}(t) = -x^3(t) + u(t); \qquad x(0) = c \tag{5.3}$$

$$y(t) = x(t) \tag{5.4}$$
The performance index used was

$$J = \sum_{k=1}^{100} \tfrac{1}{2}\, Q(k) \left[ z(k) - x(k) \right]^2 + \sum_{k=1}^{99} \tfrac{1}{2}\, R(k)\, u^2(k) \tag{5.5}$$


The  equations  used  as  the  basis  of  the  iterative  procedure  for  this  problem 
may  be  determined  from  equations  (2.  23),    (2.  27),    (2.  31),    and  (2.  32). 

The  sub-optimal  system  used  is  given  by  the  same  equations  except 
that    u  (k)     is  given  by 

$$u(k) = G \left[ z(k) - x(k) \right] \tag{5.6}$$

where     G    is  a  constant  gain  factor.    Block  diagrams  of  the  optimal  and 
the  sub-optimal  control  systems  are  given  in    figure  5.  1. 
Figures 5.2 through 5.10 give the plotted results from several data sets.
Comments on each of the figures are given below.
Figure 5.2: For this data set, R(k) = 0.01, Q(k) = 1.00, x(1) = 1.00,
and z(k) = 0.0. Convergence was achieved in three iterations. The
linear sub-optimal control system with a gain equal to 7.5 gave a
performance index of 1.1951, just 0.1 per cent higher than the optimal.
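
The sub-optimal figure quoted for this data set can be reproduced with a few lines of simulation. The sketch below is not the thesis program; it assumes constant Q, R and z, a state sum over k = 1 to 100 and a control sum over k = 1 to 99, and the function name and argument names are introduced here.

```python
def suboptimal_cost(G, Q, R, x1, z=0.0, n=100, dt=0.05):
    """Simulate x(k+1) = x - dt x^3 + dt u with u = G (z - x) and return J.

    J adds 0.5 Q (z - x)^2 for k = 1..n and 0.5 R u^2 for k = 1..n-1.
    """
    x, J = x1, 0.0
    for k in range(1, n + 1):
        J += 0.5 * Q * (z - x) ** 2
        if k < n:
            u = G * (z - x)
            J += 0.5 * R * u ** 2
            x = x - dt * x ** 3 + dt * u
    return J

if __name__ == "__main__":
    print(suboptimal_cost(G=7.5, Q=1.0, R=0.01, x1=1.0))
```

With the Figure 5.2 settings this evaluates to about 1.195, in line with the value quoted. With Q = 10 and G = 15 the same function gives about 6.386.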

Figure 5.3: For this data set, R(k), Q(k), x(1), and z(k) are the
same as for the previous data set except that z(k) = 1.0 for k > 50.
Convergence occurred in three iterations. The plot clearly shows
that u(k) anticipates the step in z(k), indicating the sense in which
this control system is "unrealizable." The sub-optimal control
system, which is non-anticipative, with a gain of 7.5 had a performance
index of 3.238, about 30 per cent higher than the anticipative optimal
system.

Figure 5.4: For this data set, R(k) = 0.01, Q(k) = 10.0, x(1) = 1.0,
and z(k) = 0.0. Convergence was achieved in three iterations. Notice
that since output error is relatively more important in this case, the
control effort used is higher and the system response is faster. The
sub-optimal control for this set had a gain of 15.0 and gave a performance
index of 6.386, about 0.1 per cent higher than the optimal.

Figure 5.5: For this data set, R(k) = 0.01, Q(k) = 10.0, x(1) = 1.0,
and z(k) = 0.0 for k < 50, but z(k) = 1.0 for k > 50. Convergence
was achieved in two iterations. With a gain of 15.0, the sub-optimal
system gave a performance index of 13.845, which is 1[digit illegible] per cent
higher than the performance index for the optimal system. When the
control system response is relatively fast, as in this case, anticipation
in the optimal system does not improve the system performance as
much.
Figure 5.6: For this data set, R(k) = 0.01, Q(k) = 10.0, x(1) = 10.0,
and z(k) = 0.0. When the initial state is as large as it is in this case,
the system is open loop unstable. (The continuous time system, ẋ + x³ = 0,
is always stable, but the sampling introduced to make the discrete time
approximation causes the system to be unstable for x(1) greater than
about 6.0.) The closed loop control system is, nevertheless, stable, at
the expense of a very large performance index. Because the discrete
time system is unstable, it is not a good representation of the continuous
time system for this case. For this reason figures 5.7 and 5.8 have
been included.
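
The open loop instability caused by sampling is easy to demonstrate. In the sketch below (an illustration with made-up function and variable names), the uncontrolled map is x(k+1) = x(k)[1 - Δ x²(k)] with sampling interval Δ. The magnitude of x grows from one step to the next whenever Δ x² exceeds 2, that is for |x| above √(2/Δ), which is about 6.3 for Δ = 0.05.

```python
import math

def open_loop(x1, dt, n):
    """Iterate x(k+1) = x(k) - dt * x(k)^3 with u = 0 for n - 1 steps.

    Returns |x(n)|, or math.inf as soon as |x| exceeds 1e6 (divergence).
    """
    x = x1
    for _ in range(n - 1):
        x = x - dt * x * x * x
        if abs(x) > 1e6:
            return math.inf
    return abs(x)

if __name__ == "__main__":
    print(math.sqrt(2.0 / 0.05))            # stability boundary for dt = 0.05
    print(open_loop(5.0, 0.05, 100))        # below the boundary: decays
    print(open_loop(10.0, 0.05, 100))       # above the boundary: diverges
    print(open_loop(10.0, 0.005, 1000))     # ten times finer sampling: decays
```

The last call corresponds to the remedy used for figures 5.7 and 5.8: with the interval reduced tenfold the boundary moves out to 20, so an initial state of 10 is again in the stable region.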

Figure  5.  7:     For  this  data  set,    the  sampling  interval  has  been  decreased 
by  a  factor  of     10     and  the  number  of  steps  has  been  increased  by  a  factor 
of     10.      This  makes  the  system  open  loop  stable,    and  once  again  a 
reasonable  discrete  time  approximation  to  the  continuous  time  system. 
Here R(k) = 0.01, Q(k) = 1.0, x(1) = 10.0, and z(k) = 0.0. Convergence
occurred  in  six  iterations. 

Figure  5.  8:     For  this  data  set,    the  comments  of  the  previous  set  apply 
except  that    Q(k)  =   10.  0.      Convergence  occurred  in  four  iterations. 

Figure 5.9: For this data set, R(k) = 1.0, Q(k) = 1.0, x(1) = 1.0,
and z(k) = 0.0. Convergence occurred in three iterations. Because
the cost of control is so high relative to the cost of output error, the
control effort expended is small and the system response is slow. As
a matter of fact, it can be shown that for the one state-variable system
the speed of response is proportional to the ratio, Q(k)/R(k). In general,
we would expect the speed of response to depend on the ratio of the norm
of the Q(k) matrix to that of the R(k) matrix.
Figure 5.10: The data for this set is the same as for the last set except
that    z(k)  =  0.0     for    k    <    50     and    z(k)  =   1.0    for    k  >    50.      Convergence 
occurred  in  four  iterations. 

5.3      Two State-Variable Examples

The  system  considered  for  the  first  two  state-variable  example  can 
be  described  by  the  equations 

$$x_1(k+1) = x_1(k) + 0.01\, x_2(k); \qquad x_1(1) = c_1 \tag{5.7}$$

$$x_2(k+1) = x_2(k) - 0.02\, x_1(k) - 0.03\, |x_2(k)|\, x_2(k) + 0.01\, u(k); \qquad x_2(1) = c_2 \tag{5.8}$$

$$y_1(k) = x_1(k) \tag{5.9}$$

$$y_2(k) = x_2(k) \tag{5.10}$$

A block diagram of this system is shown in figure 5.11.

Figure 5.11 - Two State-Variable Nonlinear System

The system described above may be thought of as the discrete approximation
for the continuous time system

$$\ddot{x}(t) + 3\, |\dot{x}(t)|\, \dot{x}(t) + 2\, x(t) = u(t) \tag{5.11}$$

$$y_1(t) = x(t) \tag{5.12}$$

$$y_2(t) = \dot{x}(t) \tag{5.13}$$

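The performance index used was

$$J = \sum_{k=1}^{100} \tfrac{1}{2} \left\{ Q_1(k) \left[ z_1(k) - x_1(k) \right]^2 + Q_2(k) \left[ z_2(k) - x_2(k) \right]^2 \right\} + \sum_{k=1}^{99} \tfrac{1}{2}\, R(k)\, u^2(k) \tag{5.14}$$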
The equations that form the basis of the iterative routine follow from
equations (2.23), (2.27), (2.31), and (2.32). The equations for the
sub-optimal systems are the same except that

$$u(k) = G_1 \left[ z_1(k) - x_1(k) \right] + G_2 \left[ z_2(k) - x_2(k) \right] \tag{5.15}$$

where G1 and G2 are constant gain factors.

Figures  5.  12  through  5.  21  give  the  plotted  results  from  ten  data 
sets  for  this  example.    Comments  on  these  figures  follow. 

Figures 5.12 - 5.14: For these data sets, R(k) = 0.01, Q1(k) = 1.00,
Q2(k) = 1.00, x1(1) = 0.0, z1(k) = 0.0, and z2(k) = 0.0. In figure 5.12,
x2(1) = 1.0, in figure 5.13, x2(1) = 3.0, and in figure 5.14, x2(1) = 10.0.
For each of these, convergence occurred in three or four iterations. The
sub-optimal control with G1 = 8.5 and G2 = 4.75 gave performance
indices of 5.360, 31.32, and 158.62 for x2(1) = 1.0, 3.0, and
10.0, respectively. The sub-optimal control system performance
indices were 17.5, 7.0, and 7.5 percent higher than the optimal
performance indices.

Figures 5.15 - 5.17: For these figures, the data were the same as
for figures 5.12 - 5.14 except that z1(k) = 0.0 for k < 50 and
z1(k) = 1.0 for k > 50. In each case convergence occurred in three
iterations.

Figure 5.18: For this data set, R(k) = 0.01, Q1(k) = 1.0, Q2(k) = 0.0,
x1(1) = 0.0, x2(1) = 1.0, z1(k) = 0.0, and z2(k) = 0.0. Convergence
occurred in five iterations. The sub-optimal control system with
G1 = 11.0 and G2 = 2.0 gave a performance index of 1.305,
about 5 per cent higher than the optimal system performance index.
Figure 5.19: The same data applies here as in the previous figure
except that z1(k) = 0.0 for k < 50 and z1(k) = 1.0 for k > 50.
Convergence occurred in five iterations.

Figure 5.20: For this data set, R(k) = 0.01, Q1(k) = 10.0, Q2(k) = 1.0,
x1(1) = 0.0, x2(1) = 1.0, z1(k) = 0.0, and z2(k) = 0.0. Convergence
occurred in four iterations. The sub-optimal system with G1 = 28.0
and G2 = 10.7 gave a performance index of 5.71, or less than
one per cent higher than that for the optimal system.

Figure 5.21: The same data applies here as in the previous figure
except that z1(k) = 0.0 for k < 50 and z1(k) = 1.0 for k > 50.
Convergence was achieved in three iterations.

The  system  considered  for  the  second  two  state-variable  example 
can  be  described  by  the  equations 

$$x_1(k+1) = x_1(k) + 0.01\, x_2(k) / \left( 1 + |x_2(k)| \right); \qquad x_1(1) = c_1 \tag{5.16}$$

$$x_2(k+1) = x_2(k) - 0.01\, x_1(k) + 0.01\, u(k); \qquad x_2(1) = c_2 \tag{5.17}$$

$$y_1(k) = x_1(k) \tag{5.18}$$

$$y_2(k) = x_2(k) \tag{5.19}$$

This  system  can  be  thought  of  as  the  discrete  approximation  to  the  system 
described  by  the  block  diagram  below 


[Block diagram of the continuous time system for the second two state-variable example]


The  performance  index  for  this  example  is  the  same  as  that  for  the 
previous  example,    and  the  equations  used  in  the  iterative  procedure  are 
the  same  except  for  the  system  equations. 

Figures  5.  22  through  5.  25  give  the  plotted  results  from  four  data 
sets  for  this  system. 
Figures 5.22 - 5.24: For these data sets, R(k) = 0.01, Q1(k) = 1.0,
Q2(k) = 1.0, x1(1) = 0.0, z1(k) = 0.0, and z2(k) = 0.0. In figures 5.22,
5.23, and 5.24, x2(1) = 1.0, 3.0, and 10.0, respectively. The number
of iterations required for convergence was three, two, and two.

Figure 5.25: For this data set, R(k) = 0.01, Q1(k) = 1.0, Q2(k) = 0.0,
x1(1) = 0.0, x2(1) = 1.0, z1(k) = 0.0, and z2(k) = 0.0. Four iterations
were required for convergence.

One  additional  variation  of  this  problem  was  run  in  an  effort  to  get 
some  indication  of  under  what  conditions  the  iterative  routine  might  not 
converge.    For  this  purpose,    the  nonlinearity  was  made  more  violent  by 
changing  the  system  equations  to 

$$x_1(k+1) = x_1(k) + x_2(k) / \left( 1.0 + 10.0\, |x_2(k)| \right); \qquad x_1(1) = c_1 \tag{5.20}$$

$$x_2(k+1) = x_2(k) - 0.01\, x_1(k) + 0.01\, u(k); \qquad x_2(1) = c_2 \tag{5.21}$$

$$y_1(k) = x_1(k) \tag{5.22}$$

$$y_2(k) = x_2(k) \tag{5.23}$$

For each of these data sets, R = 0.01, Q1 = 1.00, Q2 = 1.00,
and x1(1) = 0.00. For the data set with x2(1) = 10.0, convergence
occurred in five iterations. For the data set with x2(1) = 3.00,
convergence occurred in four iterations. For the data set with x2(1) = 1.00,
convergence occurred after some rather severe oscillations in the
convergence criterion, and then only after 19 iterations.

The convergence was slower when a small initial condition was used
probably because in this case the system spent more time operating in
the highly nonlinear regions.

We  can  conclude  from  this  variation  of  the  example  problem,    that 
when the nonlinearity is severe, the iterative routine may converge
slowly  or  not  at  all. 

For purposes of comparing convergence rates, the value of the
performance index, J, computed on each iteration has been included
in most of the preceding figures. The value of the performance index
computed on the convergent iteration is denoted by J*.

5.4      Stochastic  Examples 

The  results  of  three  stochastic  examples  are  presented  in  this 
section.    In  each  of  these  examples,    the  nonlinear  system  being 
controlled  is  disturbed  by  a  random  input. 

The  computer  algorithm  that  was  used  is  outlined  below. 

Step 1.      Using P(k) = 0, x(k) = 0, and r(k) = 0, the control and
the state variables are extrapolated ahead to determine u(1), ...,
u(99) and x(2), ..., x(100).

Step 2.      Using the u(k) and the x(k) just determined, P(99), ...,
P(11) and x(99), ..., x(11) are computed by backward recursion.

Step 3.      The control, u(1), ..., u(10), and the state, x(2), ...,
x(11), are computed with r(1), ..., r(10) taking on random values,
simulating the actual evolution of the nonlinear system.

Step 4.      Using P(k) and x(k) previously determined, and
r(k) = 0, the control and the state variables are extrapolated ahead
to determine u(11), ..., u(99) and x(12), ..., x(100).

Step 5.      Using the x(k) and the u(k) just determined, P(99), ...,
P(21) and x(99), ..., x(21) are computed.

Steps 3, 4, and 5 are then repeated, starting at k = 11, k = 21, etc.,
until the actual simulation has evolved to k = 100. The system should
be visualized with steps 4 and 5 simulating the controller in fast time,
and step 3 simulating the actual evolution of the nonlinear system in
real time.
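
The scheduling of the five steps can be sketched as a loop that alternates ten steps of disturbed "real time" evolution with a re-plan in "fast time". This is only a structural illustration. `plan` and `plant` are hypothetical callables standing in for the forward extrapolation plus backward recursion (steps 1-2 and 4-5) and for the disturbed system equations. The recursions themselves come from Chapter II and are not reproduced here. How r(k) enters the plant is left to `plant`.

```python
import random

def replanning_blocks(n_steps=100, block=10):
    """Yield (first_k, last_k) for each stretch of real-time operation (step 3)."""
    k = 1
    while k < n_steps:
        yield k, min(k + block - 1, n_steps - 1)
        k += block

def run(plan, plant, x1, n_steps=100, block=10, seed=0):
    """Alternate real-time evolution (step 3) with fast-time replanning (steps 4-5).

    plan(k, x)     -> control law, called as law(j, x) for j >= k   (hypothetical)
    plant(x, u, r) -> next state given a disturbance sample r       (hypothetical)
    Returns the list of states x(1), ..., x(n_steps).
    """
    rng = random.Random(seed)
    x, history = x1, [x1]
    law = plan(1, x)                          # steps 1-2: plan with r(k) = 0
    for first, last in replanning_blocks(n_steps, block):
        for k in range(first, last + 1):      # step 3: disturbed evolution
            x = plant(x, law(k, x), rng.gauss(0.0, 1.0))
            history.append(x)
        law = plan(last + 1, x)               # steps 4-5: recompute from new state
    return history

if __name__ == "__main__":
    print(list(replanning_blocks()))
```

The blocks run from k = 1 to 10, 11 to 20, and so on up to 91 to 99, so that controls u(1) through u(99) are applied and states through x(100) are produced, matching the indices in the steps above.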

Figure 5.26: The results given in this figure are for the example using
the system of Section 5.2, but with an independent random disturbance,
r(k), added. For this data set, R(k) = 0.01, Q(k) = 1.0, x(1) = 1.0,
and z(k) = 0.0. Notice that by the time k = 21, the P and x variables
are well determined with no more jumps, indicating that despite the
random disturbance, the control system is operating near optimally. In
this example and the others of this section, r(k) is a zero mean, unit
variance, independent random sequence.

Figure  5.  27:     The  results  given  in  this  figure  are  for  the  example  using 
the  same  system  as  the  first  example  in  section  5.  3,    but  with  an  independent 
random  disturbance,    r(k),  added  to  the    x       component.    For  this  data  set, 
R(k) = 0.01, Q1(k) = 1.0, Q2(k) = 1.0, x1(1) = 0.0, x2(1) = 1.0, z1(k) = 0.0,
and z2(k) = 0.0.

Figure 5.28: The results given in this figure are for a nonlinear system
disturbed by dependent noise. In this example, x1(k) represents the
dependent noise which is obtained from independent noise by the system

    x_1(k+1) = 0.95 x_1(k) + 0.05 r(k);        x_1(1) = 0.0        (5.24)

where r(k) is an independent random variable. The state of the nonlinear
system being controlled is represented by x2(k), and is determined by
the equation

    x_2(k+1) = x_2(k) - 0.03 x_2^3(k) + 0.05 u(k) + 0.03 x_1(k);        x_2(1) = 1.0        (5.25)

Together x1(k) and x2(k) make up an augmented two-dimensional state
vector. For this data set, R(k) = 0.01, Q1(k) = 0.0 (as we have no
control over the noise), Q2(k) = 1.0, z1(k) = 0.0, and z2(k) = 0.0.
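
As an illustration of how dependent noise is obtained from independent noise, the following Python sketch generates the sequence of equation (5.24), read here as x1(k+1) = 0.95 x1(k) + 0.05 r(k) with x1(1) = 0.0. The function name dependent_noise, the Gaussian distribution for r(k), and the fixed seed are assumptions made for the sketch; the text specifies only that r(k) is independent with zero mean and unit variance.

```python
import random


def dependent_noise(r_seq, a=0.95, b=0.05, x_init=0.0):
    """Return [x1(1), x1(2), ...] with x1(k+1) = a*x1(k) + b*r(k)."""
    x = [x_init]
    for r in r_seq:
        x.append(a * x[-1] + b * r)
    return x


rng = random.Random(0)  # fixed seed, chosen only for repeatability
# Assumed Gaussian; any zero-mean, unit-variance sequence could be used.
r_seq = [rng.gauss(0.0, 1.0) for _ in range(100)]
x1 = dependent_noise(r_seq)
```

Because each value of x1 carries over 95 percent of the previous one, successive samples are strongly correlated even though the samples of r(k) are not.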
CHAPTER VI


CONCLUSIONS 


The  major  contribution  of  this  work  has  been  the  presentation  of  a 
theory  along  with  an  iterative  algorithm  for  the  solution  of  optimal 
nonlinear  control  problems  subject  to  quadratic  performance  criteria. 
In  addition,    the  results  of  the  computer  examples  presented  in  Chapter  V 
have  demonstrated  the  feasibility  of  the  method. 

A by-product of the theory has been the analytic solution of the
problems of Chapter IV. In Chapters IV and V, comparisons of
sub-optimal systems with the optimal ones determined by the theory have
shown that near-optimal performance is often possible with simple
linear controllers, a possibility that has been suspected but not
demonstrated previously.

All is not rosy, however. Appendix C shows that the method is
essentially limited to problems of no more than five state variables
and control intervals of no more than 1000 steps by the size and speed
of presently available digital computers.

Many questions have been raised, but not answered. Of prime importance
among these is the question of under what conditions the convergence
of the iterative algorithm can be guaranteed. Further research on the problem
with stochastic disturbances is required in order to determine under
what conditions the control procedure presented in section 2.7 is
reasonable.

It would be highly desirable to be able to rephrase the problem in such
a way that the optimal control system determined by the theory would be
guaranteed to be non-anticipative. This problem has been worked on
briefly by the author, but without results.

Finally, although it is conceivable that actual control systems may be
designed by this method, it is far more likely that the main use for the
theory will be to establish ultimate performance figures for comparison
purposes in studies. Further research in this direction seems
warranted.



APPENDIX  A 


TWO  MATRIX  IDENTITIES 
Theorem A1: If R^{-1} and [R + G'PG]^{-1} exist, then

    [I + G R^{-1} G' P]^{-1} G R^{-1} G' = G [R + G'PG]^{-1} G'        (A.1)


Proof:     The  proof  uses  a  method  of  matrix  manipulations  given  by  Cox. 22 
This  method  views  a  matrix  as  a  linear  transformation  and  shows  that 
such  transformations  obey  all  the  rules  for  block  diagram  manipulation 
provided  order  of  blocks  is  preserved.    In  other  words,    block  diagram 
manipulations  may  be  used  to  prove  matrix  identities. 

For the proof of this theorem, it is easy to show that the expression
on the right-hand side of equation (A.1) can be represented by the block
diagram

[Block diagram: feedback loop containing the blocks R^{-1}, G, P, and G]

provided R^{-1} and [R + G'PG]^{-1} exist.
By moving G into the loop we get

[Block diagram: blocks R^{-1} and G in the loop, with G and P in the feedback path]

Then moving R^{-1} and G back out the other side of the loop gives

[Block diagram]

But  this  block  diagram  is  equivalent  to  the  expression 


    [I + G R^{-1} G' P]^{-1} G R^{-1} G'        (A.2)


which  proves  the  theorem. 

Theorem A2: If R^{-1} and [R + G'PG]^{-1} exist, then

    [I + G R^{-1} G' P]^{-1} = I - G [R + G'PG]^{-1} G' P        (A.3)


Proof: The proof proceeds by using the definition of an inverse.

Thus if the right-hand side of (A.3) is truly the inverse of I + G R^{-1} G' P,
then we must have

    [I + G R^{-1} G' P] [I - G [R + G'PG]^{-1} G' P] = I        (A.4)

or

    I + G R^{-1} G' P - G [R + G'PG]^{-1} G' P - G R^{-1} G'PG [R + G'PG]^{-1} G' P = I        (A.5)

By regrouping terms we get

    I + G { R^{-1} - [R + G'PG]^{-1} - R^{-1} G'PG [R + G'PG]^{-1} } G' P = I        (A.6)

But since [R + G'PG]^{-1} exists, we can write

    I + G { R^{-1} [R + G'PG] - I - R^{-1} G'PG } [R + G'PG]^{-1} G' P = I        (A.7)

The bracketed term is the zero matrix, hence

    I = I        (A.8)

proving that I - G [R + G'PG]^{-1} G' P is indeed the inverse given in (A.3).
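
Both identities can be spot-checked numerically. The Python sketch below is not part of the proofs; it evaluates each side of (A.1) and (A.3), read as [I + G R^{-1} G'P]^{-1} G R^{-1} G' = G [R + G'PG]^{-1} G' and [I + G R^{-1} G'P]^{-1} = I - G [R + G'PG]^{-1} G'P, for one arbitrarily chosen set of matrices. The helper functions and the numerical values of G, P, and R are assumptions introduced only for the check.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def add(a, b, sign=1.0):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def inverse(a):
    """Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + e for row, e in zip(a, identity(n))]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Arbitrary test data: three state variables, two controls.
G = [[1.0, 0.0], [0.5, 2.0], [-1.0, 1.0]]
P = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 3.0]]
R = [[1.0, 0.1], [0.1, 0.5]]

Gt = transpose(G)
GRG = matmul(matmul(G, inverse(R)), Gt)          # G R^-1 G'
M = inverse(add(identity(3), matmul(GRG, P)))    # [I + G R^-1 G'P]^-1
S = inverse(add(R, matmul(matmul(Gt, P), G)))    # [R + G'PG]^-1
GSG = matmul(matmul(G, S), Gt)                   # G [R + G'PG]^-1 G'

err_a1 = max_abs_diff(matmul(M, GRG), GSG)       # residual of (A.1)
err_a2 = max_abs_diff(M, add(identity(3), matmul(GSG, P), sign=-1.0))  # (A.3)
```

Both residuals should be at the level of floating-point rounding. The practical value of the identities is visible in the sizes: the left-hand sides invert a matrix whose order is the number of state variables, while the right-hand sides invert one whose order is the number of controls.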
APPENDIX B
STABILITY 

In  the  design  of  any  control  system,    the  question  of  stability  is  of 
paramount  importance.    For  this  reason,    the  stability  of  control  systems 
synthesized  using  the  theory  of  Chapters  II    and   III   is  considered  here 
briefly.    For  simplicity  we  shall  consider  first  the  continuous  time  system 
and  use  the  second  method  of  Lyapunov. 

For the unperturbed control system (i.e., z(t) = 0), the value
function (3.24) is positive definite, provided f = 0 and h = 0 when
x(t) = 0 and u(t) = 0. In addition V(x(t), t) approaches infinity as x(t)
approaches infinity.

The derivative of V with respect to time along an optimal trajectory
is given by

    \dot{V}(x(t), t) = V_t + V_x' \dot{x}(t)        (B.1)

or, by (3.19),

    \dot{V}(x(t), t) = -\frac{1}{2} \| h(x(t), t) \|^2_{Q(t)} - \frac{1}{2} \| u(t) \|^2_{R(t)}        (B.2)

The right-hand side of equation (B.2) is non-positive definite. A function
which possesses these properties is called a Lyapunov function, and the
second method of Lyapunov states that when a Lyapunov function exists
for a system, the system is stable.

As a matter of fact, \dot{V}(x(t), t) is usually negative definite, although
it is difficult to give general conditions under which this is true. In this
case the second method of Lyapunov guarantees that the system will be
asymptotically stable.

For the discrete time control system, analogous results can be
drawn using a discrete version of the second method of Lyapunov.
APPENDIX C
COMPUTATIONAL CONSIDERATIONS
C.1  Computer Storage Requirements

The discrete time problem is analyzed in this section to determine
computer storage requirements, and in the next section to determine
computer time requirements. Because we are attempting to get approximate
answers, many simplifying assumptions will be made.

The first assumption we will make is that we are interested in
computing the optimum control only. For instance, we are not interested
in computing a(k). By considering equations (2.23), (2.27), (2.31),
and (2.32), we can determine the computer storage requirements for
the iterative algorithm of section 2.4. These requirements are given
in Table I.


Table I

Variables                 Number of Registers Required

P(k)                      (1/2) n(n+1) N
x(k)                      nN
x(k) (accented vector)    nN
u(k)                      rN

Total                     [(1/2) n(n+5) + r] N


Assuming a single-input system, that is, r = 1, and a computer
with 50,000 registers, n must be less than 5 for
N = 1000 in order to fit the problem on the computer. For the same
computer with N = 100, n must be less than 20.
Even from this quick look at the storage requirements aspect of the
problem, we can immediately see that the method is going to be severely
restricted by the size of present-day computers.
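
The register count can be tabulated for other problem sizes with a few lines of Python. This is a sketch under the reading of Table I used here, namely (1/2)n(n+1)N registers for the symmetric matrix P(k), nN for each of the two n-vectors, and rN for u(k); the function name storage_registers is hypothetical.

```python
def storage_registers(n, r, N):
    """Registers needed for P(k), the two n-vectors, and u(k) over N steps."""
    p_regs = n * (n + 1) // 2 * N   # symmetric P(k): n(n+1)/2 distinct entries
    x_regs = 2 * n * N              # two n-vectors stored at every step
    u_regs = r * N                  # r control components at every step
    return p_regs + x_regs + u_regs


# Example: n = 4 state variables, a single input, N = 1000 steps.
example = storage_registers(4, 1, 1000)
```

The sum agrees with the closed form [(1/2) n(n+5) + r] N, and it grows with the square of n but only linearly with N.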


C.2  Time Requirements

For  computer  time  requirements,    we  will  determine  the  total  number 
of  mathematical  operations  involved  in  one  iteration  of  the  algorithm.    We 
will  assume  that  all  operations  require  the  same  amount  of  time.    The 
total  time  required  can  then  be  determined  by  multiplying  the  total  number 
of operations by the average time required per operation. In addition, to
simplify matters further, we will assume that the input u(k) is a scalar
(i.e., r = 1), and enters in only one component of f.

Table II was determined by examination of the same equations as
were used in determining Table I.

Table II

Variables     Number of Operations

P(k)          (1/2) n(n+1)(7n^2 + 3m^2) N
x(k)          n(7n^2 + 3m^2) N
x(k)          2n(n+m) N  (estimated)
u(k)          (2n^2 + 2n) N

Total         [4n^2 + 2n(m+1) + (1/2) n(n+3)(7n^2 + 3m^2)] N


As an example, suppose N = 1000 and n = m = 10. The total number
of operations would be on the order of 6 × 10^7. If the computer could
process, on the average, one operation every ten microseconds, the total
time required for one iteration would be about ten minutes. Again the
limitation of this algorithm using present-day computers becomes
plainly evident.

As a second example, suppose N = 1000 but n = m = 5. Then the
total number of operations required for one iteration would be on
the order of 4.5 × 10^6. At a computer speed of one operation every
... microseconds, this would require about 15 seconds per iteration.
These figures are somewhat conservative because they neglect the
time saving possible when repeated factors are encountered.
Nevertheless, the figures agree in order of magnitude with the times observed on
actual computer problems. (The actual computer times are about
one-half to two-thirds of that predicted.)
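
The operation count can be evaluated for any problem size with the short Python sketch below. It assumes the following reading of Table II, parts of which are hard to make out in the source: (1/2)n(n+1)(7n^2+3m^2)N operations for P(k), n(7n^2+3m^2)N and 2n(n+m)N for the two x(k) rows, and (2n^2+2n)N for u(k). The function names and the default of ten microseconds per operation are choices made for the illustration.

```python
def operations_per_iteration(n, m, N):
    """Operation count for one iteration, following the reading of Table II."""
    core = 7 * n**2 + 3 * m**2
    p_ops = n * (n + 1) * core // 2   # P(k) row
    xa_ops = n * core                 # x(k) row with the (7n^2 + 3m^2) factor
    xb_ops = 2 * n * (n + m)          # x(k) row marked "estimated"
    u_ops = 2 * n**2 + 2 * n          # u(k) row
    return (p_ops + xa_ops + xb_ops + u_ops) * N


def seconds_per_iteration(n, m, N, microseconds_per_op=10.0):
    """Time for one iteration if every operation takes the same time."""
    return operations_per_iteration(n, m, N) * microseconds_per_op * 1e-6


ops_large = operations_per_iteration(10, 10, 1000)
time_large = seconds_per_iteration(10, 10, 1000)
```

Under these assumptions the case n = m = 10, N = 1000 comes to roughly 6.6 × 10^7 operations, or about eleven minutes at ten microseconds per operation, which is of the same order as the figures quoted above. The term in n^2 (7n^2 + 3m^2) dominates, so the time per iteration grows roughly as the fourth power of the number of state variables.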

From these example problems, we can conclude that a problem with
5 state variables and N = 1000 steps represents about the largest
size problem that can be handled by this algorithm with presently
available computers.
APPENDIX D
FORTRAN PROGRAMS FOR TWO STATE-VARIABLE EXAMPLES

      DIMENSION X1(1000), X2(1000), Z1(1000), Z2(1000), U(1000),
     1XE1(1000), XE2(1000), P11(1000), P12(1000), P22(1000)

      COMMON X1, X2, Z1, Z2, U, XE1, XE2, P11, P12, P22, F1, F2, FX11,
     1FX12, FX21, FX22, FU, R, Q1, Q2, K

•o   i  11*1  tic 

•     "Ml,    ZKIJ.    Z2C1),    R,    01,      .     ,    •  •    .     I  TYPE 
DRfAT     (7F5.     • 

»1«CZ1  C  J  .  (  i  )    -    xi l  i  )  )     +    Q2«  IZ2(1)     -    X2(  1)  )» 

L ( Z? ( 1 J  i )  ) 

•     III 

•     ■  -ill 

• 

<) 

•    in  | 

Ml 

■    ■  <HKJ      « 

1  '  •  i       | 
      X2TEM = FX21*X1(K) + FX22*X2(K) + FU*U(K) + F2
<♦    V    =    V    ♦     31*  ilM/liMli-MUMI     +    Q2* ( Z2 ( K+l )-X2TtM)« 

M  /'(>.♦  :  )  -  -  •  I )     +    R  *U ( K ) *U ( K I 

;     =    (V    -    T t ST ) /v 

i  •  r )  -  o .  o  l  j 

b    TEST    =    V 

INT     6.     V 
6    •  =lPt 15.4) 

DO     7     J 

<  •  F    4     1     -     J 

NL  IN 
F  ]     -     F  X  1 1  *  X 1  ( K  I     -     F  X  ]     •  »  2  <  K  ) 

.!•)  -i 

•  » I  ■  •      I  •  F  U ) 

PIF 

:  l 

•    i  *  p  ;  -('*])) 

i    •    •  ■  ■  •  i  i  i 

•  •  ■    •    i  i  >  ♦  i  j 

•  ■  :  ) 

•  | 

L  ( K I  .       •  •   •  •  '  !      • 

'  I  2(K  +  1  1  -  IP12JK  +  1  >« 

I     I         1«IP11IK+1|»F)  i     , 

... 
H 1     ?  •     I  T  Y  P  E  i     1 1     . :  .    Q2 «     v  ,     iaIUI.     ■  •  ■   )  »    Z  2  (  K. )  i    U  (  K. )  »

i    P 1  ]  •    Pl2(«.)«P22(K)»l      1  » ■  t  ) 

•      llnl.     ..  ••  .-.  rATt-VMWlAbLt  <»    2&HTHIS     IS    No 

ITY    TYPE  1 5  •  .     •  .     •  .2///15X.    VHFINA 

^L     V    = 1 1  .  (■  .  •     -  •  ,     \ m •  .   ,  ,  .        •  ,  .        (,     1HU» 




      SUBROUTINE NONLIN

      DIMENSION X1(1000), X2(1000), Z1(1000), Z2(1000), U(1000),
     1XE1(1000), XE2(1000), P11(1000), P12(1000), P22(1000)
      COMMON X1, X2, Z1, Z2, U, XE1, XE2, P11, P12, P22, F1, F2, FX11,
     1FX12, FX21, FX22, FU, R, Q1, Q2, K
      F1 = X1(K) + 0.01*X2(K)

•  IK)  -   .  .       F(X2(K) )*X2<M  +  0«01»U(K) 

■  '  1  =  : . 

<?1  «  -Oi 

:  .   -  0 .  -  )  ) 

•  01 
IF  (I         ... 
! ( I )  =  0.0 

I  (•  ) 
Rt 
      SUBROUTINE NONLIN

      DIMENSION X1(1000), X2(1000), Z1(1000), Z2(1000), U(1000),
     1XE1(1000), XE2(1000), P11(1000), P12(1000), P22(1000)
      COMMON X1, X2, Z1, Z2, U, XE1, XE2, P11, P12, P22, F1, F2, FX11,
     1FX12, FX21, FX22, FU, R, Q1, Q2, K

■   )  ) 
•  .  (  •   )     >       .  •'■).. 

K  )    -        . 01 «X  1  (  K  )     +    0. 

•  J ) 

-     • 